Linstat is the SSCC's primary Linux computing cluster. Linstat combines familiar statistical software like Stata, SAS, R, and Matlab with the power of Linux, making it ideal for jobs that require more memory or computing time than Winstat can provide. Linstat also gives you access to the SSCC's HTCondor flock, where you can run multiple jobs at the same time.
Learning to run jobs on a Linux server is probably easier than you think. If you're new to Linux, be sure to read the section Getting Started on Linstat. Veteran Linux users can probably stop reading at that point, but should read the earlier sections, which describe some of Linstat's unique features.
To log in to Linstat you'll use your SSCC username (typed in lower case) and password. If you've forgotten your password, you can reset it here.
If you are outside the United States please read Connecting to Linstat from Outside the United States.
How you'll connect to Linstat depends on what kind of computer you're connecting from:
Windows PCs or Winstat
If your computer runs Windows, we suggest you connect using a program called X-Win32 (though there are many fine alternatives). X-Win32 is already installed and configured on Winstat, so one option is to log in to Winstat and run X-Win32 there. Alternatively, you can download and install a pre-configured version of X-Win32 from the SSCC web site. Simply download the installation file and then double-click on it.
You'll be asked to log in because X-Win32 is only licensed for UW faculty, staff, and students. Just give your usual SSCC username and password. To use it you'll need to first connect to the SSCC network using VPN.
When you run X-Win32 it will place an icon in the lower right corner of your screen.
Click on the icon once and choose Linstat.
For more details, including how to set up a connection to a particular Linstat server, see Connecting to SSCC Linux Computers using X-Win32.
If you are not on the UW-Madison campus you must establish a VPN connection to campus before using X-Win32.
Macs or Computers running Linux
Macs and Linux computers have client programs for connecting to Linux servers installed by default. Simply start a Terminal program (on a Mac it will be found under Applications, Utilities) and then type:
ssh -Y username@linstat.ssc.wisc.edu
username should be replaced by your SSCC username. If your username on your computer is the same as your SSCC username, you can leave it out (ssh -Y linstat.ssc.wisc.edu). If you are plugged into the wired network in the Sewell Social Sciences Building you can leave out the domain (ssh -Y linstat).
For more details, including how to connect to a particular Linstat server, see Connecting to Linstat from a Mac.
In order to display Linux graphics, including graphical user interfaces for Stata, Matlab, and other programs, Macs need to have an X windows program like XQuartz installed.
When you connect to Linstat, you'll be directed to the least busy of the four Linstat servers (linstat1, linstat2, linstat3, and linstat4) automatically. This will spread users among the four servers and help avoid situations where one server is much busier than another.
If you are running a long job and need to connect to the same server again to monitor it, log in to Linstat and then type ssh server, where server should be replaced by the name of the server where you started the job. Be sure to note which server you're on when you start a long job. Most people have the server name in their prompt, but if you don't you can find out which server you're using by typing printenv HOST. It's also possible to connect to a specific server directly—the links in the previous section have instructions.
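One way to remember which server a long job is on is to write the name down when you start the job. The following is a minimal sketch, not an SSCC-provided tool: the file name job_server.txt is hypothetical, and the fallback to uname -n (for shells that do not set HOST) is an assumption.

```shell
# Use $HOST if the login environment sets it (as the printenv HOST tip suggests);
# otherwise fall back to uname -n, which is an assumption of this sketch.
server="${HOST:-$(uname -n)}"

# Save the name next to the job's files (job_server.txt is a hypothetical name).
echo "$server" > job_server.txt

echo "Long job started on: $server"
echo "To reconnect later, log in to Linstat and type: ssh $server"
```

Run this in your project directory just before starting the job, and the server name will be waiting for you in job_server.txt.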
If you run a program in batch mode, you can log out and the program will continue to run. Putting an ampersand (&) at the end of a command will put it in batch mode. However, many programs need additional settings to work in batch mode, such as not starting a graphical user interface. These program-specific settings are described below when we talk about running programs.
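To see what the ampersand does, you can try it with a harmless command first. In this sketch, sleep stands in for a real long-running program (an assumption made so the example runs anywhere); the variable names are illustrative only.

```shell
# sleep stands in for a long-running program; replace it with your own command.
sleep 1 &
job_pid=$!    # $! holds the process ID of the most recent background command
echo "Started background job with PID $job_pid"

# The shell is free for other commands while the job runs.
echo "Doing other work while the job runs..."

# wait blocks until the job finishes and reports its exit status.
wait "$job_pid"
job_status=$?
echo "Job finished with exit status $job_status"
```

In real use you would normally not wait for the job; you would simply log out and check the results later.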
/ramdisk is a special "directory" that is actually stored in RAM, making it extremely fast. The maximum size of /ramdisk is 32GB, and any files that are not in use will be deleted after one hour. The /ramdisk directory can be very helpful for programs that spend a lot of time reading and writing temporary files.
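A job that writes its own temporary files could use /ramdisk along the following lines. This is a sketch under stated assumptions: the myjob prefix and the step1.tmp file are hypothetical, and the fallback to /tmp is added only so the example also runs on machines that have no /ramdisk.

```shell
# Prefer /ramdisk when it exists and is writable; the fallback is an
# assumption so the sketch also runs on machines without /ramdisk.
if [ -d /ramdisk ] && [ -w /ramdisk ]; then
    scratch_base=/ramdisk
else
    scratch_base="${TMPDIR:-/tmp}"
fi

# Create a private scratch directory (the "myjob" prefix is hypothetical).
scratch_dir=$(mktemp -d "$scratch_base/myjob.XXXXXX")
echo "Temporary files will go in $scratch_dir"

# Stand-in for a program that writes temporary files.
echo "intermediate results" > "$scratch_dir/step1.tmp"
ls "$scratch_dir"

# Clean up when the job is done so the space is free for other users.
rm -rf "$scratch_dir"
```

Since idle files in /ramdisk are deleted after one hour, anything you want to keep should be copied to your home or project directory before the job ends.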
We have a small number of Stata MP32 licenses, which are ideal for running computationally intensive do files. Do files run using the stata and condor_stata commands will be run using Stata MP32, though some HTCondor servers only have 8 cores and Stata MP32 will automatically adapt accordingly.
On Linstat, the default directory where SAS stores temporary data sets (the WORK library) is /ramdisk. This increases the speed of data-intensive programs significantly. It also prevents them from slowing down the entire server due to disk I/O bottlenecks.
If you need more than 32GB of temporary space, change the WORK directory to /tmp. You can do so by adding the -work option to your SAS command:
sas -work /tmp myprogram
You'll then be able to use up to 243GB of space (or as much of it as is available at the time). For more details see Running Large SAS Jobs on Linstat.
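The choice between the default WORK location and /tmp can be written as a small rule. The function below is a hypothetical helper, not part of SAS or the SSCC scripts; it only prints the command you would type, using the 32GB limit given above and your own estimate of the space needed.

```shell
# 32 is the /ramdisk size limit (in GB) described above.
ramdisk_limit_gb=32

# build_sas_command is a hypothetical helper: given an estimate of the temporary
# space needed (in GB) and a program name, it prints the SAS command to use.
build_sas_command() {
    needed_gb=$1
    program=$2
    if [ "$needed_gb" -gt "$ramdisk_limit_gb" ]; then
        echo "sas -work /tmp $program"
    else
        echo "sas $program"
    fi
}

build_sas_command 10 myprogram     # prints: sas myprogram
build_sas_command 100 myprogram    # prints: sas -work /tmp myprogram
```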
The SSCC's HTCondor flock contains 136 CPUs and is ideal for running multiple jobs at the same time. HTCondor can run Stata, SAS, Matlab, and R jobs as well as user-written programs. We've written scripts that make submitting jobs to HTCondor very easy; see An Introduction to HTCondor for instructions. (You can also submit Stata jobs to the HTCondor flock via the web.)
Due to licensing restrictions, Mplus is only installed on Linstat1, Linstat2, and Linstat3, and may only run one job at a time on each server. Because of the unusual way Mplus launches additional terminal sessions you'll need to stay logged in the entire time the program is running. Running Mplus on Linstat has more details.
Linux can be intimidating because it just waits for you to type commands without giving you any menus or icons to suggest what you can do. But if all you want to do is run jobs, you can get by with just a couple of Linux commands. Here's how:
Get your job ready using your computer
If you're on Winstat or a Windows PC that logs into the SSCC's PRIMO domain, your Linux home directory is available as the Z: drive, and Linux project directories are the V: drive. They're also available from Macs—see Using SSCC Network Disk Space from Macs. This means you can write your program, manage your files, etc. using the tools you're familiar with and still put the programs and related files on the Linux file system so Linstat can run them.
Put all the files relating to a given project in a single folder (or directory in Linux terminology), then write your programs on the assumption that that folder will be your working directory (i.e. a Stata program should say use datafile, not use z:\research\datafile). If you're only working on a single project then just declare Z: itself that project's "directory."
Command #1: cd
When you log into Linux, your "working directory" (where you "are" in the file system) starts out as your home directory (what Windows calls Z:). If that's where your project's files are, you can skip directly to running your job. Otherwise you'll need to go to your project's directory using the cd ("change directory") command. If your project's directory is on your Z: drive, type:
cd myProject
Where myProject should be replaced by the name you gave your project's directory.
If your project's directory is inside an official Linux project directory on the V: drive, type:
cd /project/myProject
A few more points on the Linux file system:
- Directories are separated using the forward slash (/) rather than the backslash (\).
- There are no drives or drive letters in Linux. All directories are part of a single tree structure with the "root" of the tree denoted by a slash (/).
- If a directory path starts with a slash (/), it starts from the root (it is an "absolute" path). Thus cd /project means "go to the root directory, then to project underneath that."
- If a directory path does not start with a slash, it is assumed to start with the current directory and go from there (it is a "relative" path). Thus cd myProject means "go to the myProject directory under the current directory."
- Linux does not like spaces in file or directory names (you have to put the whole path in quotes if it includes a space).
- Unlike Windows, Linux is case-sensitive. File and file are two different files.
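You can try these rules safely in a throwaway directory. The sketch below uses hypothetical directory and file names and creates everything under a temporary directory made by mktemp, so nothing in your home directory is touched.

```shell
# Work in a throwaway directory so nothing real is affected.
demo_root=$(mktemp -d)
cd "$demo_root"

mkdir myProject
mkdir "my project"    # a name containing a space must be quoted

# Relative path: starts from the current directory.
cd myProject
pwd

# Absolute path: starts from the root (/). The quotes handle the space.
cd "$demo_root/my project"
pwd

# Names that differ only in case are different files on Linux.
touch File file
ls
```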
Command #2: Run Your Program
The command to run your program will depend on the program you want to use. Here are some of the most popular:
You can start Stata's graphical user interface by typing xstata. You can also run a do file called mydofile.do in batch mode by typing:
stata -b do mydofile &
Alternatively you can submit it to HTCondor with:
condor_stata mydofile &
If you run mydofile.do in batch mode or on HTCondor, Stata will automatically log its output in mydofile.log.
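Because HTCondor is meant for running several jobs at once, a loop can submit a whole set of do files. The following is a dry-run sketch: the three do files are empty, hypothetical examples created in a temporary directory, and the loop only prints each command rather than submitting it.

```shell
# Throwaway directory with three empty, hypothetical do files.
demo_dir=$(mktemp -d)
cd "$demo_dir"
touch model1.do model2.do model3.do

submitted=0
for dofile in *.do; do
    name="${dofile%.do}"    # strip the .do extension
    # Dry run: print the command instead of running it.
    # On Linstat, replace this line with: condor_stata "$name" &
    echo "condor_stata $name &"
    submitted=$((submitted + 1))
done

echo "Prepared $submitted jobs"
```

Each job would then write its output to its own log file (model1.log, model2.log, and so on), as described above.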
You can start SAS's graphical user interface by typing sas, though it's somewhat clunkier than the Windows version. You can also run a program called myprogram.sas in batch mode by typing:
sas myprogram &
To run R, simply type R. It does not have a graphical user interface but the commands are the same as in Windows R or RStudio.
To run an R program in batch mode, type:
R < myprogram.R > myprogram.log --no-save &
To submit myprogram.R to HTCondor and save the output to myprogram.log, type:
condor_R myprogram.R myprogram.log
If your job uses multiple processors, type:
condormp_R program.R program.log &
You can start Matlab's graphical user interface by typing matlab. To run a Matlab program myprogram.m in the background and save its output in myprogram.log, type:
matlab -nodisplay -nojvm < myprogram.m > myprogram.log &
To submit myprogram.m to HTCondor and save its output in myprogram.log, type:
condor_matlab myprogram.m myprogram.log &
If your job uses multiple processors, type:
condormp_matlab program.m program.log &
To run an Mplus job, log into Linstat1, Linstat2, or Linstat3, and type:
mplus myprogram.inp &
where myprogram.inp should be replaced by the name of the Mplus program (the .inp file) you want to run.
Linstat has many other programs available (see our software database). See the documentation of the program you're interested in for details on how to run it.
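The batch commands above can be summarized by file extension. The function below is a hypothetical memory aid, not an SSCC script: it only prints the command described in this article for a given file and does not run anything.

```shell
# batch_command is a hypothetical helper that prints the batch-mode command
# from this article for a given program file, based on its extension.
batch_command() {
    file=$1
    base="${file%.*}"    # file name without its extension
    case "$file" in
        *.do)  echo "stata -b do $base &" ;;
        *.sas) echo "sas $base &" ;;
        *.R)   echo "R < $file > $base.log --no-save &" ;;
        *.m)   echo "matlab -nodisplay -nojvm < $file > $base.log &" ;;
        *.inp) echo "mplus $file &" ;;
        *)     echo "unknown program type: $file" >&2
               return 1 ;;
    esac
}

batch_command mydofile.do     # prints: stata -b do mydofile &
batch_command myprogram.R     # prints: R < myprogram.R > myprogram.log --no-save &
batch_command myprogram.inp   # prints: mplus myprogram.inp &
```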
While this will get you started, there are several other SSCC Knowledge Base articles you can read to become a more flexible and efficient Linstat user. Managing Jobs on Linstat will teach you how to monitor and manage jobs while they run. An Introduction to HTCondor will teach you more about the SSCC's HTCondor flock and how to use it. Finally, if you really want to make yourself at home in Linux, read the SSCC's Getting Started in Linux. For a full list of articles, visit the Linux section of our Knowledge Base. SSCC staff will also be happy to answer any questions you have about using Linstat and help you solve any problems you run into—just contact the Help Desk.