Use PROC IMPORT to Read a SAS Dataset

OVERVIEW

In UNIX, PROC IMPORT and PROC EXPORT must be executed under the X Window System. Otherwise, you have to add "-noterminal" to the command (e.g., "sas -noterminal file_name.sas").

In SAS, there are various data sources, as shown in the following figure. In general, SAS reads data using the INFILE statement and PROC IMPORT. You may use the PUT statement in a DATA step or PROC EXPORT to export data sets into external files. INFILE should be used in a DATA step, while PROC IMPORT and PROC EXPORT are independent procedures.

Data Source in SAS

The INFILE statement reads data directly using the DATALINES (CARDS) statement, imports various ASCII text files, and imports data sets through a network (i.e., FTP and HTTP). INFILE and DATALINES also read data in a matrix form. PROC IMPORT reads ASCII text files, databases (Access, dBASE), and spreadsheets (Lotus 1-2-3, Excel).

If you have simple data, read them using INFILE and DATALINES; otherwise, use PROC IMPORT. Personally, I prefer the Import Wizard to PROC IMPORT due to its convenient interface and flexibility. If you have a data set generated in other software packages (e.g., Excel, dBASE III, Paradox, Stata, and SPSS), use a data conversion utility like Stat/Transfer.

INFILE STATEMENT

The INFILE statement identifies external files to be read by the INPUT statement. It reads an ASCII text file that is typically delimited by a space (default), tab, comma, or other delimiters. Let us take the following example first.

LIBNAME js 'c:\sas';
FILENAME egov 'c:\sas\egov.txt';

DATA js.seoul;
INFILE egov LRECL=250 FIRSTOBS=7 OBS=700;
INPUT name $ male training corrupt negative;
RUN;

  • The LIBNAME statement assigns a library, an alias for a collection of data sets, to the specified directory (c:\sas).
  • The FILENAME statement associates a file reference with an external file (drive+path+filename). Without the statement, you should explicitly specify the drive, path, and file name in the INFILE statement, as in "INFILE 'c:\sas\egov.txt';".
  • The "DATA js.seoul" creates a SAS data set "seoul" in the "js" library. The result of this DATA step is stored in "seoul.sas7bdat" in c:\sas\.
  • The INFILE statement specifies the file to be read. You may add options, such as LRECL=n FIRSTOBS=n OBS=n, if necessary.
  • The "LRECL=250", the logical record length, tells SAS to read a line up to 250 columns.
  • The "FIRSTOBS=7" tells SAS to read data from the seventh line; the first six lines are ignored.
  • The "OBS=700" tells SAS to stop reading after the 700th line of the file; combined with FIRSTOBS=7, lines 7 through 700 are read. Omitting this option reads all observations in a data file.
  • The INPUT statement specifies a list of variable names, types, and lengths depending on input styles.
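
As a minimal sketch of the alternative mentioned in the FILENAME bullet, the same file can be read by quoting the path directly in the INFILE statement. The data set name "seoul_direct" and the PROC PRINT check are illustrative assumptions, not part of the original example.

```sas
/* Sketch: same read as above, but without a FILENAME statement.      */
/* The data set name seoul_direct is hypothetical; it goes to WORK.   */
DATA seoul_direct;
  INFILE 'c:\sas\egov.txt' LRECL=250 FIRSTOBS=7 OBS=700;
  INPUT name $ male training corrupt negative;
RUN;

/* Print the first five observations to check the result. */
PROC PRINT DATA=seoul_direct (OBS=5);
RUN;
```

A file reference is usually more convenient when the same file is read in several DATA steps, since the path then needs to be changed in one place only.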

INFILE OPTIONS

  • DSD (delimiter-sensitive data) tells SAS to treat two consecutive delimiters as a missing value and to remove quotation marks from character values.
  • FIRSTOBS specifies the record number at which SAS begins reading input data records.
  • FLOWOVER (default) reads the next data line if there is not enough data in the current data line for all variables specified. This method may be problematic when there are missing values.
  • LRECL (logical record length) specifies the logical record length. This option is not valid when DATALINES is specified.
  • MISSOVER sets all remaining variables without values to missing, and then reads the next line for a new observation.
  • STOPOVER stops a DATA step when INPUT does not find values for all the variables specified. This option is useful when we need to check missing values in raw data.
  • TRUNCOVER does not read a new data line when INPUT does not find values in the current data line for all variables specified. This option sets all remaining variables without values to missing.

DATA js.miss;
INFILE DATALINES MISSOVER;
INPUT id korean english;

DATALINES;
01 87 65
02 86
03 95
04
RUN;

The above example reads four observations with missing values. Note that "english" of the second observation is missing because the data line ends before a third value is found. If the MISSOVER option is omitted, SAS reads only two observations with the data messed up; "english" of the second observation is set to 3, taken from the "03" on the next line. If the STOPOVER option is used instead of MISSOVER, SAS will stop at the second data line since there are not enough data values for the variables specified.
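
The difference between the default FLOWOVER and MISSOVER can be checked side by side. The following is a small sketch that reads the same four lines twice; the data set names "flow" and "miss" are illustrative and are stored in the WORK library.

```sas
/* FLOWOVER is the default, so no option is given here. */
DATA flow;
  INFILE DATALINES;
  INPUT id korean english;
  DATALINES;
01 87 65
02 86
03 95
04
;
RUN;

/* Same data with MISSOVER: short lines yield missing values. */
DATA miss;
  INFILE DATALINES MISSOVER;
  INPUT id korean english;
  DATALINES;
01 87 65
02 86
03 95
04
;
RUN;

PROC PRINT DATA=flow; RUN;
PROC PRINT DATA=miss; RUN;
```

In the first step the SAS log should also note that SAS went to a new line when INPUT reached past the end of a line, which is a useful warning sign when a data set has fewer observations than expected.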

IMPORT & EXPORT PROCEDURES

Since version 8.0, SAS provides the IMPORT procedure to read various types of external files, such as Access, Excel, and dBASE III. In previous versions, the DBF and DIF procedures dealt with dBASE III and spreadsheet files, respectively.

The IMPORT procedure creates data sets from various types of external files, so an additional DATA step is not necessary unless the data set needs to be manipulated. In other words, the IMPORT procedure does not require the INPUT and DATALINES (or CARDS) statements.

There are two ways of using the IMPORT procedure. One is to write a PROC IMPORT statement in a standard SAS program; the other is to use the Import/Export Wizard, which is available under the File menu. Personally, I prefer the Import Wizard to the procedure, since the former provides a user-friendly interface, high flexibility, and other useful features.

Is the IMPORT procedure always better than the INFILE statement? It depends. For example, when you need to read a huge Access file (say, larger than 1GB) but only several variables are needed for analyses, the INFILE statement is recommended.

IMPORT & EXPORT WIZARD

The Import/Export Wizard provides extremely flexible ways of reading external data. The wizard works in a graphical user interface (GUI) environment such as X Window and Microsoft Windows. If your data file is messy and ill-organized, the wizard will be a good solution.

The Import/Export Wizard actually works in the same way that the IMPORT and EXPORT procedures in a SAS program do; the Wizard can generate SAS programs for the IMPORT and EXPORT procedures.

SAS Import Wizard

In order to use the Wizard,

  1. Click "File" from the top menu bar and choose "Import Data..."
  2. Choose a file type from the list. You may choose "User-defined formats" for messy data.
  3. Locate an external file to be imported.
  4. Provide a library and SAS data set name.
  5. If you want SAS to create a SAS program file that is equivalent to what the Wizard does, specify the file name.

READ DIRECT DATA ENTRY

You can directly input data in a DATA step. INFILE DATALINES tells SAS to construct a data set from the data lines after the DATALINES statement. The following example reads a numeric variable id, a string variable depart, and a numeric variable employee. Note that data items are delimited (separated) by a space. In this case of the space-delimited format, INFILE DATALINES can be omitted.

DATA fruit;
INFILE DATALINES;
INPUT id depart $ employee;

DATALINES;
1 sales 16
2 finance 25
3 research 10
RUN;

IMPORT & EXPORT ASCII TEXT FILES

The above example implies that ASCII text data (egov.txt) are delimited with a space (or blank). If an ASCII file is delimited by something other than a space, you need to explicitly specify the delimiter using the DELIMITER (or DLM) option, which accepts various characters or combinations of characters as delimiters. For example, DLM='09'x for Tab, DLM='&' for ampersand, DLM='^' for caret, DLM=';' for semicolon, DLM='END' (each of the characters E, N, and D then acts as a delimiter), and so forth.

DATA js.seoul;
INFILE egov DELIMITER=',';
INPUT name $ male training corrupt negative;
RUN;
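
The same pattern applies to tab-delimited files by passing the hexadecimal tab character to DLM. This is a sketch only: the file "c:\sas\egov_tab.txt", the file reference "egovtab", and the assumption that the first line holds column headers are all hypothetical.

```sas
/* Hypothetical tab-delimited copy of the data; path is an assumption. */
FILENAME egovtab 'c:\sas\egov_tab.txt';

DATA seoul_tab;
  /* '09'x is the tab character; FIRSTOBS=2 skips an assumed header line. */
  /* DSD keeps empty fields between two tabs as missing values.           */
  INFILE egovtab DLM='09'x DSD FIRSTOBS=2 MISSOVER;
  INPUT name $ male training corrupt negative;
RUN;
```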

The INFILE statement is also able to read the data stream of DATALINES (or CARDS). You may specify DATALINES instead of an external file name. Note that each data element in the following example is separated by a caret (^).

DATA js.fruit;
INFILE DATALINES DELIMITER='^';
INPUT id fruit $ sales;

DATALINES;
1^Grape^100
2^Pear^77
RUN;

What if a comma is used as a delimiter and there is a missing value in the second data line? The DSD option is the answer. DSD reads a value as missing between two consecutive delimiters (,,). Without this option, the following program reads only the first observation.

DATA fruit;
INFILE DATALINES DLM=',' DSD;
INPUT id fruit $ sales;

DATALINES;
1,Grape,100
2,,77
RUN;

  • "DATA fruit" does not specify a library, thus the data set "fruit" is stored in the "work" library, which is the default temporary library and is deleted when the SAS session ends.
  • Since the DSD option sets a comma as the default delimiter, DLM=',' is not necessary in this example. But if another delimiter is used, both the DLM and DSD options should be used to read missing values correctly.

If you want to export a data set into an external file, use the PUT statement. PUT allows you to control the output format flexibly, but it is a bit difficult, especially for beginners, to use this statement correctly.
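
A minimal sketch of exporting with PUT, assuming the "fruit" data set from the DSD example above still exists in the WORK library; the output path "c:\sas\fruit_out.csv" is hypothetical.

```sas
/* _NULL_ runs the DATA step without creating a new data set. */
DATA _NULL_;
  SET fruit;
  /* FILE names the output file; DSD with DLM=',' writes comma-separated */
  /* values and leaves an empty field for a missing character value.     */
  FILE 'c:\sas\fruit_out.csv' DLM=',' DSD;
  PUT id fruit sales;
RUN;
```

The FILE statement plays the role for output that INFILE plays for input, and PUT mirrors INPUT, so the list, column, and formatted styles are available here as well.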

Now, let us use the IMPORT and EXPORT procedures to handle ASCII text files. You need to specify the delimiter of the ASCII text file in the DBMS option.

PROC IMPORT DATAFILE="c:\sas\ego.csv" OUT=jeeshim.egov DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;

  • The DATAFILE option specifies an external file to be imported.
  • The OUT option specifies the SAS data set to be created.
  • The DBMS option specifies the type of the external file. For example, "DBMS=DLM", "DBMS=CSV", and "DBMS=TAB".
  • The REPLACE option overwrites an existing file, if any.
  • The GETNAMES statement reads variable names from the first line of the data file.
  • The DATAROW statement tells the row from which SAS reads observations.
  • GETNAMES and DATAROW require a semicolon at the end.

The following example exports a data set to a space-delimited ASCII text file. The DATA and OUTFILE options respectively specify the data set to be exported and the external file to which data are exported.

PROC EXPORT
Data=jeeshim.egov
OUTFILE="c:\sas\ego.txt"
DBMS=DLM REPLACE;
RUN;
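
With DBMS=DLM the delimiter defaults to a space. As a sketch, a different delimiter can be requested with the DELIMITER statement; the vertical bar and the output file name "ego_bar.txt" are illustrative assumptions.

```sas
/* Export the same data set with a vertical bar as the delimiter. */
PROC EXPORT
  DATA=jeeshim.egov
  OUTFILE="c:\sas\ego_bar.txt"
  DBMS=DLM REPLACE;
  DELIMITER='|';
RUN;

/* Read the file back; the output data set name egov_bar is hypothetical. */
PROC IMPORT DATAFILE="c:\sas\ego_bar.txt" OUT=work.egov_bar DBMS=DLM REPLACE;
  DELIMITER='|';
  GETNAMES=YES;
RUN;
```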

IMPORT & EXPORT EXCEL FILES

PROC IMPORT imports spreadsheet files such as Excel and Lotus 1-2-3.

PROC IMPORT DATAFILE="c:\sas\avails.xls" OUT=js.pc DBMS=EXCEL2000 REPLACE;
SHEET="computer";
GETNAMES=YES;
RUN;

  • The SHEET statement specifies the worksheet to be imported. If this statement is omitted, the worksheet will be referred to as "sheet1", "sheet2", and so on.
  • If you want to read a subset of a worksheet, provide a starting and ending cell after the worksheet name. For example, SHEET="computer$B3:F100"; reads data from B3 through F100, excluding column A and rows 1 and 2.
  • The EXCEL2000 indicates the Microsoft EXCEL 2000 format. Other releases include EXCEL4, EXCEL5, and EXCEL97. For LOTUS 123 files, use WK1, WK3, or WK4.
  • The GETNAMES=YES reads variable names from the first row of the worksheet. The default is YES.

The following example exports a data set to an Excel file. Note that there is only one semicolon at the end of the PROC EXPORT statement.

PROC EXPORT
DATA=js.egov
OUTFILE="c:\sas\egov.xls"
DBMS=EXCEL2000 REPLACE;
RUN;

IMPORT & EXPORT dBASE III FILES

A dBASE III file has only one table that has a well-defined data structure. You may omit the "DBMS=DBF" option since SAS can recognize the file format from the extension ".dbf".

PROC IMPORT
DATAFILE="c:\sas\computer.dbf"
OUT=js.pc
DBMS=DBF REPLACE;
RUN;

The following example exports a data set to a dBASE III file.

PROC EXPORT
DATA=js.pc
OUTFILE="c:\sas\pc.dbf"
DBMS=DBF REPLACE;
RUN;

IMPORT & EXPORT ACCESS FILES

Unlike a dBASE III or FoxPro file, an Access file can have more than one table with security features. Thus, you have to provide the database, table, account (identification), and password to access a database file.

PROC IMPORT TABLE="computer" OUT=js.pc DBMS=ACCESS REPLACE;
UID=""; PWD=""; WGDB="";
DATABASE="c:\sas\asset.mdb";
RUN;

  • The DATABASE statement specifies the drive, path, and Access file name (e.g., "asset.mdb").
  • The TABLE option specifies the table name in the Access file (e.g., "computer").
  • The UID statement specifies the ID of the user who is allowed to access the database. Similarly, PWD specifies the password to log in to the database. You may ignore them unless a specific UID and PWD are designated.

The EXPORT procedure can export a SAS data set to an Access table. Note that OUTTABLE specifies a table name to be generated in the Access file.

PROC EXPORT
DATA=js.egov
OUTTABLE="egov2000"
DBMS=ACCESS2000 REPLACE;
DATABASE="c:\sas\egov.mdb";
RUN;

USING NETWORK RESOURCES

SAS can access data through FTP, HTTP, and TCP/IP. This network functionality provides high flexibility and convenience in the information era.

The following example uses FTP to access the data resources. Note that FTP, USER, PASS, and HOST are reserved words for protocol, account name, password, and host computer name, respectively. The PROMPT option tells SAS to prompt for the user login password, if necessary.

FILENAME myftp FTP 'gov.txt' USER='kucc625' PASS='xxxxx' PROMPT
HOST='mdss.iu.edu' CD='/sas/egov';

DATA js.egov;
INFILE myftp;
INPUT year land $ domain index;
RUN;

The following examples use URL to access data files available on a website.

FILENAME myurl URL 'http://masil.org/archives/airline.txt';

Data js.egov;
INFILE myurl;
INPUT airline year output price fuel load;
RUN;


FILENAME myurl URL 'http://www.masil.org/archives/smoking.txt';

DATA js.smoking;
INFILE myurl FIRSTOBS=34;
INPUT state $ cigar bladder lung kidney leukemia area;
RUN;

In order to use the TCP/IP connection, you have to use SOCKET instead of URL.
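
A rough sketch of the SOCKET access method in client mode follows. The host name, port number, and record layout are all hypothetical placeholders; a real TCP server that sends space-delimited text lines is assumed.

```sas
/* Hypothetical host and port; replace with a real TCP server. */
FILENAME mysock SOCKET 'data.example.org:5000';

DATA stream;
  /* The layout (two numeric fields per line) is an assumption. */
  INFILE mysock MISSOVER;
  INPUT year index;
RUN;
```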

USING INFORMATS

If data include commas in numbers, dollar signs ($), and percent signs (%), you have to use informats such as COMMAn.n, DOLLARn.n, and PERCENTn.n in the formatted input style.

DATA sales;
INFILE DATALINES;
INPUT year quan COMMA9.0 sales DOLLAR10.0 rate PERCENT3.;

DATALINES;
2000 184,871 $2,875,879 80%
2001 875,877 $5,987,972 89%
RUN;
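
Formatted input reads a fixed number of columns, so it depends on every value sitting in the same position on each line. As a hedged alternative, the colon (:) modifier applies the informat but stops at the next blank, which tolerates values of varying width. The data set name "sales2" and the display formats are illustrative choices.

```sas
DATA sales2;
  INFILE DATALINES;
  /* The colon modifier reads up to the next blank using the informat. */
  INPUT year quan :COMMA9. sales :DOLLAR10. rate :PERCENT4.;
  /* Formats only control how the stored numbers are displayed. */
  FORMAT quan COMMA9. sales DOLLAR10. rate PERCENT6.;
  DATALINES;
2000 184,871 $2,875,879 80%
2001 875,877 $5,987,972 89%
;
RUN;

PROC PRINT DATA=sales2;
RUN;
```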

If data are in date formats, you may use the MMDDYY10. and MMDDYY8. informats. Note that the period (.) in the third observation indicates a missing value.

DATA book;
INFILE DATALINES;
/* +1 skips the blank between the two dates */
INPUT showtime MMDDYY10. +1 end MMDDYY8.;

DATALINES;
08/12/1999 03/01/04
03/02/1987 02/25/91
03/02/1991 .
RUN;

READING MATRICES

SAS can also read such matrix forms as correlation coefficients (TYPE=CORR), covariance (TYPE=COV), and parameter estimates (TYPE=EST).

DATA corr_mat (TYPE=CORR);
INFILE DATALINES MISSOVER;
_TYPE_='CORR';
INPUT _NAME_ $ var1 var2 var3;

DATALINES;
var1 1.00000
var2 0.25757 1.00000
var3 0.57844 0.54865 1.00000
RUN;

  • The _TYPE_, a special SAS variable, is used to distinguish the various statistics such as _TYPE_='CORR', _TYPE_='MEAN', and _TYPE_='STD'.
  • The _NAME_ is needed to identify the row of the correlation matrix.

USING CLIPBOARD

SAS can read text data from and write text data to the clipboard.

You may import a worksheet of an Excel file. First highlight the part of the worksheet in Excel and copy it to the Windows Clipboard. Suppose you choose five variables.

FILENAME clipboard CLIPBRD;

DATA excel;
INFILE clipboard;
INPUT x1-x5;
RUN;
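
Writing to the clipboard works in the opposite direction with the FILE and PUT statements. This is a sketch that assumes the "excel" data set created above; the file reference "clipout" is a hypothetical name.

```sas
FILENAME clipout CLIPBRD;

/* Write x1 through x5 of each observation to the Windows Clipboard. */
DATA _NULL_;
  SET excel;
  FILE clipout;
  PUT x1-x5;
RUN;
```

The clipboard contents can then be pasted into a worksheet or a text editor.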

ACCESSING THROUGH ODBC

SAS can access data through the ODBC connection. You need to define a DSN for database or spreadsheet files in advance ("access_dsn").

  1. Run the "ODBC Data Source Administrator" at the Control Panel
  2. Click "System DSN" and then choose "Add" to create a new DSN
  3. Select a proper driver (mdb, xls, etc.) and then provide a DSN name and description
  4. Specify database, spreadsheet, or database server (local or SQL server name)
  5. Click "Advanced" to enter login ID and password, if necessary

The DSN is connected by the CONNECT statement in the SQL procedure. A series of SQL statements such as CREATE and SELECT follow the CONNECT statement.

PROC SQL;
CONNECT TO ODBC AS db_con
(DATASRC="access_dsn" USER=kucc625 PASSWORD=xxxx);
SELECT * FROM CONNECTION TO db_con (SELECT * FROM egov2004);
...;
QUIT;
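
The SELECT above only displays the query result. As a sketch, a CREATE TABLE statement can store the result as a SAS data set, and DISCONNECT closes the connection explicitly. The output name "work.egov2004" is an illustrative choice; the DSN, user, and password are the placeholders used above.

```sas
PROC SQL;
  CONNECT TO ODBC AS db_con
    (DATASRC="access_dsn" USER=kucc625 PASSWORD=xxxx);

  /* Save the pass-through query result as a SAS data set. */
  CREATE TABLE work.egov2004 AS
    SELECT * FROM CONNECTION TO db_con (SELECT * FROM egov2004);

  DISCONNECT FROM db_con;
QUIT;
```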

DATA CONVERSION UTILITIES

SAS cannot directly read customized data formats such as Stata *.dta. Accordingly, you have to export a data set in the software into a general file format (e.g., *.csv) and then import it into a SAS data set. Or you may use the COPY procedure to create a transport file that can be recognized by other software.
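
A minimal sketch of creating a transport file with the COPY procedure and the XPORT engine follows. The library reference "trans" and the file name "egov.xpt" are hypothetical, and the "js" library with the "egov" data set from the earlier examples is assumed.

```sas
LIBNAME js 'c:\sas';
/* The XPORT engine writes a transport file instead of a native data set. */
LIBNAME trans XPORT 'c:\sas\egov.xpt';

PROC COPY IN=js OUT=trans MEMTYPE=DATA;
  SELECT egov;
RUN;
```

Note that the XPORT engine follows older naming rules, so data set and variable names longer than eight characters may need to be shortened first.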

However, these tasks often become cumbersome and burdensome to most researchers. The most efficient solution in this case is to use professional data transfer utilities such as Stat/Transfer and DBMS/COPY that transfer data from one software package to another.

These utilities support a variety of file formats such as SAS, Stata, Gauss, Rats, Access, dBASE, FoxPro, Paradox, Lotus 1-2-3, Quattro, Excel, Sigmaplot, Minitab, and SPSS.

Source: https://www.iuj.ac.jp/faculty/kucc625/sas/import.html