Friday, July 29, 2011

New SOAP Functions in SAS 9.3

SAS 9.3 includes several functions for SOAP calls, the most basic of which is SOAPWEB. SOAP is a web service protocol that works by exchanging XML. A web service, in SOAP, is similar to a procedure or CALL routine in the way you put a question to it and it returns a potentially complex answer, but:

  • The request takes the form of an XML object, which may include data and a request for a specific kind of processing or analysis.
  • The service can be located on any web server.
  • The results are returned in the form of an XML object.

In the SOAPWEB function, you provide filerefs for the input and output XML and a URL (a web address) for the web service. Another 11 optional arguments are available for parameters and authentication.

The return value is a return code. The function documentation does not describe it directly, but presumably the function returns a positive integer code if the Internet connection or the web service is not found.
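A minimal sketch of a SOAPWEB call, assuming the argument order is request fileref, URL, response fileref, as I read the documentation. The file paths and the service address are hypothetical, and the request file would already have to contain the SOAP envelope.

```sas
/* Hypothetical file paths for the request and response XML. */
filename request  'C:\temp\request.xml';
filename response 'C:\temp\response.xml';

data _null_;
   /* Hypothetical endpoint, not a real service. */
   url = "http://example.com/services/quote";
   /* Assumption: filerefs are passed by name as character values. */
   rc = soapweb("request", url, "response");
   /* Write the return code to the log so it can be checked. */
   put rc=;
run;
```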

There are five other SOAP functions that do essentially the same thing. Each provides a slightly different approach to authentication.

Sunday, July 24, 2011

Checkpoint and Restart Modes

SAS 9.3 lets you divide a program into segments, each of which consists of one or more steps, so that you can automatically restart the program where it left off. This is useful for large-scale programs that may occasionally run into trouble because of their extensive use of system resources.

To simplify somewhat, you divide a program into segments by writing statement labels before the first DATA or PROC statement and selected DATA and PROC statements that follow, plus one more at the end of the program. These labeled statements are the points where the program will resume when you restart it after it fails. The reason the program is able to restart is that SAS saves information about the state of the session in a checkpoint library, whenever the program gets to the next checkpoint without error.

This is all controlled by system options: LABELCHKPT to run a program using checkpoints; LABELCHKPTLIB= to select a checkpoint library other than the WORK library; and if a program fails, LABELRESTART to restart a program at the last checkpoint before the point of failure. SAS also recommends the NOWORKINIT, NOWORKTERM, ERRORCHECK, and ERRORABEND system options to preserve the WORK library and collect valid checkpoint data. If you are running in an environment where it isn’t practical to preserve the WORK library, rewrite the program to use another library instead.
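As a rough sketch of how a labeled program might look, assuming a label is written as a SAS name followed by a colon, as with any statement label. The label names, data set names, and program file name are all hypothetical, and the command lines appear only as comments because these options are set at startup.

```sas
/* Assumed invocation (operating system command line):
   First run:
      sas -sysin myprogram.sas -labelchkpt -noworkinit -noworkterm
          -errorcheck strict -errorabend
   Restart after a failure (same options plus LABELRESTART):
      sas -sysin myprogram.sas -labelchkpt -labelrestart -noworkinit -noworkterm
          -errorcheck strict -errorabend
*/

/* First segment: one step that builds the input data. */
extract: data work.raw;
   set sashelp.class;
run;

/* Second segment: two steps that are restarted together. */
summarize: proc sort data=work.raw out=work.sorted;
   by sex;
run;

proc means data=work.sorted;
   by sex;
   var height weight;
run;

/* Third segment: the final report. */
report: proc print data=work.sorted;
run;
```

If the PROC MEANS step failed, a restart would be expected to resume at the summarize label, rerunning the sort but not the extract step.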

I couldn’t count the number of times I’ve restarted a program manually by erasing the beginning steps in a program, then adding in the appropriate OPTIONS and TITLE statements, in order to rerun the end of it. That approach fails so often, though, that I often prefer to restart a program from the beginning, despite the use of computer resources. The checkpoint and restart modes take the guesswork out of restarting a program, so that you can restart somewhere in the middle without having to create a modified version of the program.

If you don’t know where a program might fail, you might prefer to use step checkpoints rather than labeled checkpoints. In this mode, SAS automatically creates a checkpoint for every step. Use the system options STEPCHKPT, STEPCHKPTLIB=, and STEPRESTART.

If there is a step that has to run every time, even in restart mode, mark the step with this statement, before the DATA or PROC statement:

CHECKPOINT EXECUTE_ALWAYS;

With this statement, the step runs every time the program runs, regardless of restart mode.
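For example, a step that sets a macro variable for later steps is a natural candidate. This is only a sketch; the macro variable name and the data sets are hypothetical.

```sas
/* The step that follows runs on every execution, including restarts. */
checkpoint execute_always;
data _null_;
   /* Store today's date for use in later titles or file names. */
   call symputx('rundate', put(today(), date9.));
run;

/* An ordinary step, subject to the usual checkpoint and restart logic. */
data work.stamped;
   set sashelp.class;
   run_date = "&rundate";
run;
```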

Thursday, July 21, 2011

The HHMMSS Informat

The HHMMSS informat, new in SAS 9.3, reads time fields in hours, minutes, and seconds that do not contain punctuation. The data value can represent either a time of day or a duration. This informat is especially useful for fields that do not always contain minutes and seconds. These are examples of valid time of day fields for the HHMMSS informat:

09
0944
094401

With more than six digits in a field, the extra digits on the left are treated as part of the hours — useful if data values may go beyond 99 hours. The HHMMSS informat also reads fields that contain punctuation, and for these fields, it works the same as the TIME informat.
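A short sketch that reads the three example values above. The width of 8 and the use of the colon modifier are my own choices for the illustration, not something the documentation requires.

```sas
data times;
   /* Colon modifier: read each value up to the next blank. */
   input t :hhmmss8.;
   /* Display the result as a time of day. */
   format t time8.;
   datalines;
09
0944
094401
;
run;

proc print data=times;
run;
```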

HHMMSS is the only new general-purpose informat in SAS 9.3. No new general-purpose formats were introduced. There are three new formats of note, which are for locale-based formatting of SAS datetime values including time zone.

The new documentation clarifies that 5-digit year numbers, though supported by some informats and formats, are considered an undocumented feature in SAS.

Wednesday, July 20, 2011

SAS 9.3 Macro Language Features for Context

SAS 9.3 adds macro functions, macro statements, and automatic macro variables to allow macro language programming to have more contextual awareness.

The new macro functions and macro statements have to do with macros in the current session. The %SYSMACEXIST and %SYSMACEXEC functions search for a macro by name, telling you whether the macro has been defined in the current session and whether it is currently executing. The %SYSMEXECDEPTH and %SYSMEXECNAME functions allow you to discover the names of all currently executing macros.
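A minimal sketch of these functions in use. The macro names showstack and outer are hypothetical. As I understand it, depth 0 refers to open code, and %SYSMEXECDEPTH returns the depth of the macro that calls it.

```sas
%macro showstack;
   %local i;
   /* Walk from open code (depth 0) down to this macro. */
   %do i = 0 %to %sysmexecdepth;
      %put NOTE: level &i is %sysmexecname(&i);
   %end;
%mend showstack;

%macro outer;
   %showstack
%mend outer;

%outer

/* 1 if the macro has been defined in this session, 0 otherwise. */
%put NOTE: showstack defined? %sysmacexist(showstack);
```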

The %SYSMACDELETE statement deletes macros from the current session. The %SYSMSTORECLEAR statement closes the stored compiled macro library. Both statements have to be used with care; you cannot delete a currently running macro, nor can you deassign a stored compiled macro library if one of its macros is running.
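One way to guard a deletion is to check first that the macro exists and is not executing. This is a sketch only; scratch and dropmacro are hypothetical macro names.

```sas
%macro scratch;
   %put NOTE: temporary helper;
%mend scratch;

%macro dropmacro(name);
   /* Delete only if the macro is defined and not currently executing. */
   %if %sysmacexist(&name) and not %sysmacexec(&name) %then %do;
      %sysmacdelete &name;
   %end;
%mend dropmacro;

%scratch
%dropmacro(scratch)

/* Check whether the definition is still present. */
%put NOTE: scratch defined? %sysmacexist(scratch);
```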

The new automatic macro variables allow you to write SAS code that adjusts for the sizes of objects, along with a few related considerations. SYSSIZEOFLONG, SYSADDRBITS, SYSSIZEOFPTR, and SYSSIZEOFUNICODE indicate the size of a long integer, address, pointer, and Unicode character, respectively. SYSENDIAN provides the byte sequence of numeric values.
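To see what these variables contain on a particular system, it is enough to write them to the log:

```sas
/* Display the platform-related automatic macro variables in the log. */
%put NOTE: SYSSIZEOFLONG=&syssizeoflong;
%put NOTE: SYSADDRBITS=&sysaddrbits;
%put NOTE: SYSSIZEOFPTR=&syssizeofptr;
%put NOTE: SYSSIZEOFUNICODE=&syssizeofunicode;
%put NOTE: SYSENDIAN=&sysendian;
```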

Another new automatic macro variable, SYSODSESCAPECHAR, tells you the ODS escape character. Using this macro variable, you can add the ODS escape character to a string, or check for (and encode) escape character conflicts in character data, without having to know what the escape character is.
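A small sketch of the idea. The caret is set as the escape character here only so that the log message is readable; the title text is hypothetical.

```sas
/* Set an escape character explicitly for this illustration. */
ods escapechar='^';
%put NOTE: The ODS escape character is &sysodsescapechar;

/* Build an inline style function without hardcoding the character. */
title "Quarterly Report &sysodsescapechar{style [font_style=italic] draft}";
proc print data=sashelp.class(obs=5);
run;
title;
```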

Tuesday, July 19, 2011

HTML in SAS 9.3

With SAS 9.3, it’s pretty clear that SAS wants everyone to start using HTML as their base output format. That’s actually been a good idea since SAS 9 was first released. HTML output is easier to read, takes less paper if you print it, and is easier to deliver to the world or to integrate into other documents than the “print file,” the paginated monospace text of the Listing destination format that SAS relied on for the previous four decades. And though HTML might seem more complicated than Listing, it’s not actually any harder to create. You will find a series of changes in SAS 9.3 that make the transition easier.

  1. HTML is now the default destination if you are running SAS in the GUI windowing environment. That is, you don’t have to use any ODS statements to get HTML output.
  2. ODS graphics have been separated from SAS/GRAPH and moved entirely to base SAS. ODS graphics are now included in your HTML output with just a base SAS license. If you use the Listing destination, you can still use ODS graphics, but you have to look at the graphics separately, an approach we will all quickly come to see as an unnecessary inconvenience.
  3. There are improvements in the DOCUMENT procedure to make it more compatible with the PRINT procedure, and to allow arbitrary text to be added to a document. These changes are especially useful if you are using the DOCUMENT procedure for formatted output such as HTML output.
  4. There was a sense that ODS style attributes were converging with CSS attributes when SAS 9 was released, and that continues with several new ODS style attributes in SAS 9.3. The new attributes such as WHITESPACE, PADDING, and BORDERCOLLAPSE provide ODS support for attributes that you would take for granted in CSS formatting of an HTML document.
  5. There is a new style, HTMLBlue, that enhances readability and makes more efficient use of space when displaying tables. It also, in my opinion, just looks more glamorous in a graphic design sense. The HTMLBlue style is the default style for the HTML destination in SAS 9.3.

There are two tricky points with HTML being the default destination. First, if you were already running a program in SAS 9.2, you may want to keep its output style the same as before. There are three new system options that, used together, can provide the SAS 9.2 behavior, so that you don’t have to rewrite your programs to keep them from changing.

Second, while HTML is the default destination in the GUI windowing environment, Listing remains the default in all other environments. This means the output format could change just because you run a program in a different way. If this is a problem, add ODS statements to the program to explicitly select the destination you want and close the destination you don’t want. This is not as big a change as it might sound. You can start a program with this statement to close all ODS destinations, without having to know which ones are open:

ods _all_ close;

Follow this with an ODS HTML, ODS LISTING, or other ODS statement to open the destination of your choice. This is all it takes to get consistent ODS output from a program, not affected by where you run it.
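Put together, the pattern might look like this sketch, with a hypothetical output file name:

```sas
/* Close whatever destinations happen to be open. */
ods _all_ close;

/* Open the one destination this program should write to. */
ods html file='report.html' style=HTMLBlue;

proc print data=sashelp.class;
run;

/* Close the destination to finish writing the file. */
ods html close;
```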

If you have older programs that use the PRINTTO procedure to select a destination file for output, you need to know that the PRINTTO procedure works only for the Listing destination, and not for any other ODS destination. Even for the Listing destination, it is simpler to select destination files using the ODS LISTING statement. However, continue to use the PRINTTO procedure if you use it to temporarily reroute the SAS log, or use it with the Listing destination to combine the log and output in the same file.
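A sketch of the two approaches side by side, with hypothetical file paths: PROC PRINTTO to reroute the log temporarily, and the ODS LISTING statement to name the Listing output file.

```sas
/* Temporarily route the log to a file. */
proc printto log='C:\temp\step1.log' new;
run;

data work.demo;
   x = 1;
run;

/* Restore the default log destination. */
proc printto;
run;

/* For Listing output, name the file in the ODS statement instead. */
ods listing file='C:\temp\step1.lst';
proc print data=work.demo;
run;
ods listing close;
```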

Saturday, July 16, 2011

Breaking Up the Dictionary

The last time we saw it, SAS Language Reference: Dictionary was 2,225 pages. Two of its chapters accounted for half the length of the book: Chapter 4, “Functions and CALL Routines,” provided 920 pages, and Chapter 7, “SAS System Options,” 254 more. If the SAS documentation team had simply expanded this book for SAS 9.3, it would have been nearly 3,000 pages in length — perhaps a little too much for a single book.

That explains the need for seven of the new books introduced for SAS 9.3. The two chapters I mentioned have turned into SAS Functions and CALL Routines: Reference (1,027 pages) and SAS System Options: Reference (333 pages). Other volumes cover statements, data set options, and formats and informats. Part II of the Dictionary has been replaced with a separate volume on component objects. Appendix 1, “DATA Step Debugger,” has become the core of a new book, Base SAS Utilities: Reference.

This is the most prominent change in the shape of the documentation, but there are others, as specific features are moved from formal to topical locations. For example, the OPTIONS, OPTLOAD, and OPTSAVE procedures, which have to do with system options, are now covered in SAS System Options: Reference. (The same chapters are repeated in Base SAS Procedures Guide, but just, I imagine, one last time, to give readers a chance to get used to looking for procedures in their new locations. Base SAS Procedures Guide weighs in at 1,794 pages, while some 3,500 pages of documentation for other base SAS procedures are found in other books.)

Thursday, July 14, 2011

Names, and Moving Away From Display Manager

The rules for SAS names get a little more expansive in SAS 9.3. We have long been able to use arbitrary characters and accented letters in variable names and in the names of some SQL tables. The characters of Asian scripts are also available with the right character encoding. Now the same expanded character set applies to the names of SAS data sets and item stores — but with a few cautions.

Names that aren’t traditional SAS names have to be written as name literals: a quoted string marked with the letter N for name. System options can permit or prohibit the extended use of characters in names. Use VALIDVARNAME=ANY to allow additional characters in variable names, or VALIDVARNAME=V7 to prohibit characters other than ASCII letters, digits, and underscore. In SAS 9.3, the system option VALIDMEMNAME=EXTEND allows special characters in member names. Use VALIDMEMNAME=COMPATIBLE to limit yourself to member names that will work in SAS 9.2 and older versions.

The use of special characters is limited to SAS data sets (including views) and item stores. Catalogs and other member types are still limited to traditional SAS names.
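A brief sketch of name literals for both a data set and a variable. The names themselves are hypothetical.

```sas
/* Allow extended characters in variable names and member names. */
options validvarname=any validmemname=extend;

/* Name literals: a quoted string followed by the letter N. */
data work.'Sales 2011'n;
   'Unit Price'n = 9.95;
run;

proc print data=work.'Sales 2011'n;
run;
```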

These punctuation characters cannot be used in SAS file names:

/ \ * ? " < > | : -

The null character is also not permitted.

There are restrictions on the use of these characters (so don’t use them if you don’t have to):

% & . # $

Spaces at the beginning or end of a name literal are not included in the member name. Also, like all SAS names, member names are case-insensitive, even when you write them as name literals.

A member name can’t be more than 32 bytes long. Depending on the character encoding, you could be limited to as few as 8 characters.

Here is the big limitation on extended member names, though: they can’t necessarily be used in the interactive environment. From SAS 9.3 System Options: Reference:

The windowing environment supports the extended rules in the Editor, Log, and Output windows when VALIDMEMNAME=EXTEND is set. In most SAS windows, these extended rules are not supported. For example, these rules are not supported in SAS Explorer, the VIEWTABLE window, and windows that you open using the Solutions menu.

The list of incompatibilities attached to the legacy display manager and full-screen windows of the SAS interactive environment is growing, not shrinking, with the new release. Given the issues with names and character encodings, along with other concerns, SAS is marginalizing its interactive environment. I don’t think that means SAS wants us all to switch to batch mode, though that will be necessary for many applications in SAS 9.3. I believe SAS remains committed to interactive users, so it may have an eye toward rebuilding its entire user interface or replacing it with something more contemporary and perhaps more flexible and modular in a future release.

Wednesday, July 13, 2011

New SAS Log and Logging Features in SAS 9.3

First things first — what’s new in the SAS log? The log in SAS 9.3 is basically the same as what we have seen since SAS 7, but there is one new feature of note.

The RESETLINE statement, a global statement, resets the program line numbers that appear with program lines in the log. The RESETLINE statement does not affect the line number of the program line it appears on, but the line numbers start over at 1 beginning with the next program line. You might use the RESETLINE statement at the beginning of a SAS program so that the program lines in the log correspond to specific places in the program. Another place where you might use the RESETLINE statement is at the end of an autoexec file.
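In a program, that might look like this sketch, with a hypothetical data set name:

```sas
/* Last statement of a setup section or autoexec file. */
resetline;
data work.demo;   /* per the description above, this should be line 1 in the log */
   x = 1;
run;
```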

Separately, there are extensive changes in the logging facility. It was introduced in SAS 9.2 but comes into its own in SAS 9.3. The logging facility is a cleaner way to generate log messages and deliver them, not necessarily to the SAS log, but to external destinations that now may include “third-party databases and Java classes.” A log event indicates a logger (category), along with a log message (text) and a message level (or threshold), such as info or error. You can use functions, component objects, or macro language to create log events. Filters in the logging facility determine which messages go to which destinations.
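A rough sketch of creating a log event with the DATA step functions, as I understand them from the documentation. The logger name App.Demo and the message text are hypothetical, and where the message ends up depends on how the logging facility is configured for the session.

```sas
data _null_;
   /* Define a logger (category) with a threshold of info. */
   rc = log4sas_logger("App.Demo", "level=info");
   /* Send one event at the error level to that logger. */
   rc = log4sas_logevent("App.Demo", "error", "Database connection failed.");
run;
```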

SAS programs have always been able to write text in the log as a message. The idea here, though, is to have an object-oriented approach to messages so that selected messages can be delivered in particular ways. This could include adding messages to web pages along with the program’s results. Apparently it could also include triggering system-level scripts that respond to specific events or problems that may occur in the execution of a SAS program. As an example, a logging event could result in a SAS program being rerun later if the program fails because of a certain kind of database error. It’s easy to imagine a program for which you would want this to happen automatically, instead of having a person note the failure, read the SAS log, identify the point of failure, and manually relaunch the SAS program.

Tuesday, July 12, 2011

SAS 9.3 Is Available

SAS announced the availability of the newest SAS release, SAS 9.3, early today. Right out of the box, SAS 9.3 shows substantial changes that extend SAS’s reach, but that in some ways represent a break from the past. SAS users generally will need to be aware of the major new features even if they are not using them immediately, in order to navigate the changes successfully.

This blog will consider new features in SAS 9.3 as I come upon them. At the same time, I will be revising my books Professional SAS Programmer’s Pocket Reference, Routine SAS, Professional SAS Programming Shortcuts, and Professional SAS Programming Secrets, along with the Global Statements web site, to bring them up to date with the latest SAS features. I will be writing about my progress and difficulties in that work and asking for suggestions and priorities for each book. Given its topic, this blog is not meant to be open-ended. If all goes well, it may have run its course in a few months.

The important disclaimers: I am not affiliated with SAS and have no inside knowledge, so everything here is based on published information. SAS and other SAS product names are registered trademarks of SAS Institute Inc. in the United States and other countries. If you are interested in licensing SAS, please contact SAS for further information. We are all taking our first look at SAS 9.3 at this point, so some of my ideas here may be untested and speculative, and I may change my mind as I go along. If something I say doesn’t seem quite right, please take the time to comment. Thanks for reading!