Friday, October 21, 2011

Books at SESUG 2011

The SESUG conference gets underway in Alexandria, Virginia, this weekend. If you are attending, look for the book display in the SAS Demo Room starting Monday morning. Also in the SAS Demo Room, take your coding problems to the Code Doctors. Follow the ebb and flow of the SESUG 2011 conference with the #SESUG11 hashtag on Twitter.

Thursday, October 20, 2011

SAS 9.3 Statistics

I’m not really the one to tell you about statistical modeling improvements in SAS 9.3. You would rather get that from Ken Kleinman, coauthor of books like Using SAS for Data Management, Statistical Analysis, and Graphics. Two blog posts explore enhanced statistical modeling features in SAS. On some specific points, the changes find SAS catching up with features in R.

Example 9.7: New stuff in SAS 9.3-- Frailty models

Example 9.8: New stuff in SAS 9.3-- Bayesian random effects models in Proc MCMC

Wednesday, October 12, 2011

WUSS 2011, San Francisco

People are arriving in San Francisco for this year’s WUSS conference, which runs October 12–14. If you’re at the conference, remember to take your coding problems to the Code Clinic, and of course, take a look at the latest in SAS books at the Exhibit and Demo Areas.

Tuesday, October 11, 2011

SAS 9.3 ‘Very Compatible’

Paul Homes, writing at platformadmin.com, says SAS 9.3 is an easy migration, with a high degree of compatibility with SAS 9.2. And for those who never quite made it to SAS 9.2, he suggests skipping over that release:

it makes much more sense to me to go straight to 9.3 – the direct migration path is supported and why do two migrations and two rounds of testing when you can do one?

Sunday, September 25, 2011

MWSUG ’11 Kansas City

If you are at the MidWest SAS Users Group (MWSUG) conference in Kansas City, September 25–27, look for Professional SAS Programmer’s Pocket Reference, 6th edition, on the shelf at the SAS Press exhibit in the SAS Innovation Center. The new pocket reference is one of several books you’ll find that cover SAS 9.3.

Friday, September 16, 2011

The Rebuilt PROC PRINT in SAS 9.3

A SAS Global Forum paper by Darylene Hecht, “PROC PRINT and ODS: Teaching an Old PROC New Tricks,” provides a lightweight introduction to the new SAS 9.3 PRINT procedure. The paper was written months before SAS 9.3 was released, so some of the look and feel changed slightly, but you still get an idea of how much changed in the procedure.

Monday, September 12, 2011

Introducing Professional SAS Programmer’s Pocket Reference, 6th Edition

Two months after the release of SAS 9.3, Professional SAS Programmer’s Pocket Reference is up to speed on the new SAS release. The 6th edition, updated for SAS 9.3, is on its way today to warehouses in five states and should be widely available in a week or two.

More than a few things in Professional SAS Programmer’s Pocket Reference have stayed the same in the 17 years since the book’s first release. For example, the new edition is priced at $20.00, just 5 cents more than the price of the first edition in 1994. Yet you will immediately notice changes in the 6th edition, starting with the new, more portable format and full-color cover. Inside, the 6th edition reshuffles the chapters for the first time in a decade, removing the display manager chapter and combining a few of the shorter chapters to make room for new chapters for data set options and data step component objects. On a less obvious note, the new cover treatment makes the book more suitable for recycling when you are eventually done with it.

Thanks again to the many people who made suggestions and requests for the new edition. I couldn’t incorporate every suggestion, partly because the request I heard most often was to not make the book any longer, but I tried to make the new edition as useful as it could be for the largest group of readers.

Tuesday, August 23, 2011

Faster Log Notes

It seems as if log processing has been streamlined, so that log messages don’t take as much time. In the past, log messages slowed SAS down enough to become noticeable when a data step generated hundreds of them, but that seems to be less of an issue in SAS 9.3.

Sunday, August 14, 2011

An Issue With URLs

There is an issue with character encoding in the two SAS functions for URLs, URLENCODE and URLDECODE. URLs are uniform resource locators used on the Internet, such as web addresses. To work correctly, they need to be presented in the character encoding the Internet runs on, UTF-8. Yet SAS, as of release 9.3, has limited support for UTF-8, so the arguments to the URL functions might also be in the SAS session encoding.

To get around this potential stumbling block, there is a new system option in SAS 9.3 that lets you indicate the encoding of the argument to the URL functions. Set the option URLENCODING=SESSION if the argument is in the session encoding or URLENCODING=UTF8 if it is in UTF-8 encoding.
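As a minimal sketch of the option in use, the step below sets URLENCODING= and passes a value through URLENCODE and URLDECODE. The sample string and the variable names are made up for illustration.

```sas
/* Arguments to the URL functions are in the session encoding here. */
/* Use URLENCODING=UTF8 instead if the strings are already UTF-8.   */
options urlencoding=session;

data _null_;
   length raw encoded decoded $ 200;
   raw = 'pocket reference & notes';      /* hypothetical sample value */
   encoded = urlencode(strip(raw));       /* escape characters not allowed in a URL */
   decoded = urldecode(strip(encoded));   /* reverse the escaping */
   put raw= / encoded= / decoded=;
run;
```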

Friday, August 12, 2011

The 64-bit Observation Counter

The EXTENDOBSCOUNTER option is mentioned in the announcement for the new edition of Professional SAS Programmer’s Pocket Reference, and that made me realize I needed to write a post to explain what it is.

It is no longer a gee-whiz moment when a data table exceeds 2 billion rows. And with the computers of 2011, it is not such a big deal to be processing that many observations in a SAS program. So within a few years, it would have started to seem quaint that SAS can count observations only up to about 2 billion.

That is why SAS set its sights on a new observation counter that can count observations up to 9 quintillion. The new 64-bit observation counter is available in SAS 9.3, but it is not the default because it is not compatible with any prior SAS release. If you are working with data that could run into the billions of observations, though, you would do well to start using the new expanded observation counter as soon as you complete the transition from SAS 9.2 to SAS 9.3.

All this requires is the EXTENDOBSCOUNTER=YES option in the LIBNAME statement for the library. To use a 64-bit observation counter on an individual SAS data file, write the option as a data set option when you create the file.
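The two forms might look like this. It is only a sketch: the library names, paths, and data set names are hypothetical.

```sas
/* Library level: data files created in this library get the 64-bit counter */
libname big '/data/big' extendobscounter=yes;      /* hypothetical path */

data big.events;
   set work.events_staging;                        /* hypothetical input */
run;

/* File level: request the 64-bit counter for one new data file only */
libname archive '/data/archive';                   /* hypothetical path */

data archive.events (extendobscounter=yes);
   set work.events_staging;
run;
```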

The 64-bit observation counter will become the norm, I am sure, at some point in the future when the chances of encountering SAS 9.2 and earlier SAS releases become relatively slight. At this point, though, it is important to note that the EXTENDOBSCOUNTER=YES option makes SAS data sets incompatible with SAS 9.2, so it isn't the right move in a company that is still using SAS 9.2 on some machines. At worst, it would force you to make copies of SAS libraries in SAS 9.3 before you could use them in SAS 9.2 — not a big deal if there are just a few million observations, but a process you would rather avoid if the observation count is closer to a billion.

Professional SAS Programmer’s Pocket Reference: A Small Book for Working With Big Data

A new edition of Professional SAS Programmer’s Pocket Reference covering SAS 9.3 will be available in September 2011.


“Big analytics” is one of the themes of SAS 9.3, released one month ago on July 12. SAS was always noted for its ability to process large-scale data, and SAS 9.3 adds new capabilities to make it more nimble in getting results from the “big data” that businesses increasingly rely on for a competitive edge.

Author Rick Aster has responded to this theme of a nimble approach to big data — and to the new features in SAS — with Professional SAS Programmer’s Pocket Reference, 6th edition. This new edition of Aster’s popular compact reference guide for SAS is updated with the new features of SAS 9.3. At the same time, it is almost 20% lighter than the previous edition, in spite of having the same number of pages and the same amount of content as before. The lighter weight, made possible by improved paper and design, is an advantage for SAS professionals who increasingly find themselves going from one place to another in the course of their work in SAS.

The new 6th edition of Professional SAS Programmer’s Pocket Reference covers new features in SAS 9.3 that have to do with big data, such as the EXTENDOBSCOUNTER option that makes it easier to manage files that have more than 2 billion records.

The book also covers new SAS 9.3 features that allow a more modular approach to programming, easier internationalization for programs that are used in multiple countries, and new ways to use SAS on the Internet.

Publisher Breakfast Books hopes to have the new book ready for September 12, when the first of the fall SAS conferences gets underway.


Catalog page:

http://www.breakfast.us/catalog/psppr.html

SAS is a registered trademark of SAS Institute Inc.

Tuesday, August 9, 2011

Formatting Long and Short Durations With a Picture Format

Two new picture directives are specifically designed for formatting time durations. This allows you to use the FORMAT procedure to create formats to show time durations even if they are particularly long or short periods of time.

Using a picture format for a time duration was somewhat perilous in previous SAS releases, because the time picture directives were oriented toward time of day. If a duration ran more than 24 hours, the picture format would neglect the whole days that had elapsed and report only the fractional days. A time of 5 days, 5 seconds would show up just as 5 seconds.

That is no longer a problem with the %n picture directive, which fills in the previously missing days. A picture such as "%n:%H:%M:%S" gives you the duration in days, hours, minutes, and seconds.

For shorter time periods where fractions of seconds matter, the %s picture directive provides the fractional part of the seconds value, including the decimal point.
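Here is a sketch of both directives in a PROC FORMAT step. The format names LONGDUR and SHORTDUR, the DEFAULT= widths, and the sample values are assumptions for illustration. The pictures are written in single quotes so that the percent signs are not mistaken for macro references.

```sas
proc format;
   /* %n supplies the whole days of a duration */
   picture longdur (default=20)
      other = '%n:%H:%M:%S' (datatype=time);
   /* %s supplies the fractional seconds, including the decimal point */
   picture shortdur (default=20)
      other = '%H:%M:%S%s' (datatype=time);
run;

data _null_;
   elapsed = 5*86400 + 5;   /* 5 days, 5 seconds, as a SAS time value */
   lap = 61.25;             /* 1 minute, 1.25 seconds */
   put elapsed= longdur. / lap= shortdur.;
run;
```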

Sunday, August 7, 2011

Using a Function As a Format

A subtle change in the FORMAT procedure in SAS 9.3 makes it possible to use a function as a format.

SAS already allowed you to use another format within a value format. For example, you could create a format that provided special formatted values for the numbers 1 and 2, then formatted all other values with the BEST format. The syntax for this involves writing the format in brackets, for example:

other = [best8.]

To pass the range off to a function instead, write the function name and parentheses inside the brackets. The example in Base SAS 9.3 Procedures Guide suggests creating a function called QFMT (in the FCMP procedure), then using it in a value format, also called QFMT, and defined in this statement:

   value qfmt
         other=[qfmt()];

There is only one range, OTHER, in this format because all formatted values are generated by the function.
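To show the whole round trip, here is a small sketch that is not the Procedures Guide example. The function TEMPBAND, its cutoff values, and the WORK.FUNCTIONS location are all hypothetical; the format is given the same name as the function, following the QFMT pattern.

```sas
/* Define a character-valued function with PROC FCMP */
proc fcmp outlib=work.functions.demo;
   function tempband(t) $ 8;
      if t < 0 then return('freezing');
      else if t < 20 then return('cool');
      else return('warm');
   endsub;
run;

/* Tell SAS where to find compiled functions */
options cmplib=work.functions;

/* Hand every value to the function */
proc format;
   value tempband
         other=[tempband()];
run;

data _null_;
   do t = -5, 10, 30;
      put t= t tempband.;
   end;
run;
```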

This approach essentially makes it possible to use all the computational possibilities of the FCMP procedure, but then access the resulting value as a format.

Thursday, August 4, 2011

Deleting Macros in SAS 9.3

In previous SAS releases, it was possible to delete a macro after you had defined it, but you had to know the name of the WORK.SASMACR catalog, the entry type of a macro entry, and the workings of the PROC CATALOG step in order to do the deletion. SAS 9.3 simplifies the process of deleting a macro with a new macro statement, the %SYSMACDELETE statement.

Write the macro name in this statement, and the macro is deleted. For example, to delete the RESETP macro, write:

%SYSMACDELETE RESETP;

If you are not sure the macro exists, use the NOWARN option to delete it with no warning message if it turns out the macro does not exist:

%SYSMACDELETE RESETP / NOWARN;
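A short sketch that puts the pieces together, using the %SYSMACEXIST function (also new in SAS 9.3) to confirm the effect. The body of the RESETP macro here is a placeholder.

```sas
%macro resetp;
   %put NOTE: RESETP ran.;
%mend resetp;

/* %SYSMACEXIST returns 1 if the macro is defined in WORK.SASMACR, 0 if not */
%put Before delete: %sysmacexist(resetp);

%sysmacdelete resetp;
%put After delete: %sysmacexist(resetp);

/* The macro is already gone; NOWARN suppresses the warning */
%sysmacdelete resetp / nowarn;
```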

Friday, July 29, 2011

New SOAP Functions in SAS 9.3

SAS 9.3 includes several functions for SOAP calls, the most basic of which is SOAPWEB. SOAP is a web service protocol that works by exchanging XML. A web service, in SOAP, is similar to a procedure or CALL routine in the way you put a question to it and it returns a potentially complex answer, but:

  • The request takes the form of an XML object, which may include data and a request for a specific kind of processing or analysis.
  • The service can be located on any web server.
  • The results are returned in the form of an XML object.

In the SOAPWEB function, you provide filerefs for the input and output XML and a URL, a web address, for the web service. There are another 11 arguments, if needed, for parameters and authentication.

The return value is a return code. The function documentation does not say so directly, but presumably it returns a positive integer code value if the Internet connection or the web service is not found.

There are five other SOAP functions that do essentially the same thing. Each provides a slightly different approach to authentication.

Sunday, July 24, 2011

Checkpoint and Restart Modes

SAS 9.3 lets you divide a program into segments, each of which consists of one or more steps, so that you can automatically restart the program where it left off. This is useful for large-scale programs that may occasionally run into trouble because of their extensive use of system resources.

To simplify somewhat, you divide a program into segments by writing statement labels before the first DATA or PROC statement and selected DATA and PROC statements that follow, plus one more at the end of the program. These labeled statements are the points where the program will resume when you restart it after it fails. The reason the program is able to restart is that SAS saves information about the state of the session in a checkpoint library, whenever the program gets to the next checkpoint without error.

This is all controlled by system options: LABELCHKPT to run a program using checkpoints; LABELCHKPTLIB= to select a checkpoint library other than the WORK library; and if a program fails, LABELRESTART to restart a program at the last checkpoint before the point of failure. SAS also recommends the NOWORKINIT, NOWORKTERM, ERRORCHECK, and ERRORABEND system options to preserve the WORK library and collect valid checkpoint data. If you are running in an environment where it isn’t practical to preserve the WORK library, rewrite the program to use another library instead.

I couldn’t count the number of times I’ve restarted a program manually by erasing the beginning steps in a program, then adding in the appropriate OPTIONS and TITLE statements, in order to rerun the end of it. That approach fails so often, though, that I usually prefer to restart a program from the beginning, despite the use of computer resources. The checkpoint and restart modes take the guesswork out of restarting a program, so that you can restart somewhere in the middle without having to create a modified version of the program.

If you don’t know where a program might fail, you might prefer to use step checkpoints rather than labeled checkpoints. In this mode, SAS automatically creates a checkpoint for every step. Use the system options STEPCHKPT, STEPCHKPTLIB=, and STEPRESTART.

If there is a step that has to run every time, even in restart mode, mark the step with this statement, before the DATA or PROC statement:

CHECKPOINT EXECUTE_ALWAYS;

With this statement, the step runs every time the program runs, regardless of restart mode.
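As a rough sketch of a program laid out for labeled checkpoints: the library, data set names, and labels below are hypothetical, and the sketch leaves out the closing label at the end of the program. The checkpoint options themselves are set when SAS starts, so they appear only in a comment, and the exact invocation syntax depends on the operating environment.

```sas
/* Start SAS with options along these lines (syntax varies by system):    */
/*   -labelchkpt -noworkinit -noworkterm -errorcheck strict -errorabend   */
/* and add -labelrestart when rerunning after a failure.                  */

libname mylib '/data/mylib';             /* hypothetical library */

extract: data mylib.orders;              /* labeled checkpoint */
   set mylib.raw_orders;                 /* hypothetical input */
   where amount > 0;
run;

summarize: proc means data=mylib.orders noprint;   /* labeled checkpoint */
   var amount;
   output out=mylib.order_stats sum=total;
run;

checkpoint execute_always;               /* the next step runs even in restart mode */
proc print data=mylib.order_stats;
run;
```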

Thursday, July 21, 2011

The HHMMSS Informat

The HHMMSS informat, new in SAS 9.3, reads time fields in hours, minutes, and seconds that do not contain punctuation. The data value can represent either a time of day or a duration. This informat is especially useful for fields that do not always contain minutes and seconds. These are examples of valid time of day fields for the HHMMSS informat:

09
0944
094401

With more than six digits in a field, the extra digits on the left are treated as part of the hours — useful if data values may go beyond 99 hours. The HHMMSS informat also reads fields that contain punctuation, and for these fields, it works the same as the TIME informat.
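A quick way to try the informat on the sample fields above is a step like this one. The variable name and the choice of the TIME8. format for display are just for illustration.

```sas
data _null_;
   input t hhmmss6.;      /* read up to six digits with no punctuation */
   put t= time8.;         /* display the resulting SAS time value */
   datalines;
09
0944
094401
;
```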

HHMMSS is the only new general-purpose informat in SAS 9.3. No new general-purpose formats were introduced. There are three new formats of note, which are for locale-based formatting of SAS datetime values including time zone.

The new documentation clarifies that 5-digit year numbers, though supported by some informats and formats, are considered an undocumented feature in SAS.

Wednesday, July 20, 2011

SAS 9.3 Macro Language Features for Context

SAS 9.3 adds macro functions, macro statements, and automatic macro variables to allow macro language programming to have more contextual awareness.

The new macro functions and macro statements have to do with macros in the current session. The %SYSMACEXIST and %SYSMACEXEC functions search for a macro by name, telling you whether the macro has been defined in the current session and whether it is currently executing. The %SYSMEXECDEPTH and %SYSMEXECNAME functions allow you to discover the names of all currently executing macros.
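A small sketch of these functions at work, using two made-up macros, OUTER and INNER. Level 0 in %SYSMEXECNAME refers to open code.

```sas
%macro inner;
   %local i;
   %put Nesting depth: %sysmexecdepth;
   /* Walk the stack of executing macros, from open code inward */
   %do i = 0 %to %sysmexecdepth;
      %put Level &i: %sysmexecname(&i);
   %end;
   /* 1 if OUTER is currently executing, 0 if not */
   %put OUTER executing: %sysmacexec(outer);
%mend inner;

%macro outer;
   %inner
%mend outer;

%outer
```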

The %SYSMACDELETE statement deletes macros from the current session. The %SYSMSTORECLEAR statement closes the stored compiled macro library. Both statements have to be used with care; you cannot delete a currently running macro, nor can you deassign a stored compiled macro library if one of its macros is running.

The new automatic macro variables allow you to write SAS code that adjusts for the sizes of objects, along with a few related considerations. SYSSIZEOFLONG, SYSADDRBITS, SYSSIZEOFPTR, and SYSSIZEOFUNICODE indicate the size of a long integer, address, pointer, and Unicode character, respectively. SYSENDIAN provides the byte sequence of numeric values.

Another new automatic macro variable, SYSODSESCAPECHAR, tells you the ODS escape character. Using this macro variable, you can add the ODS escape character to a string, or check for (and encode) escape character conflicts in character data, without having to know what the escape character is.
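To see what these variables hold on a particular machine, something like the following sketch would do. The title text and the choice of ^ as the escape character are arbitrary.

```sas
/* Report the environment details in the log */
%put Long integer size: &syssizeoflong;
%put Address bits: &sysaddrbits;
%put Pointer size: &syssizeofptr;
%put Unicode character size: &syssizeofunicode;
%put Byte order: &sysendian;

/* Use the escape character without hard-coding it in the title */
ods escapechar='^';
%put ODS escape character: &sysodsescapechar;

title "Results for &sysodsescapechar{style [font_weight=bold] Q3}";
proc print data=sashelp.class (obs=3);
run;
title;
```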

Tuesday, July 19, 2011

HTML in SAS 9.3

With SAS 9.3, it’s pretty clear that SAS wants everyone to start using HTML as their base output format. That’s actually been a good idea since SAS 9 was first released. HTML output is easier to read, takes less paper if you print it, and is easier to deliver to the world or to integrate into other documents than the “print file,” the paginated monospace text of the Listing destination format that SAS relied on for the previous four decades. And though HTML might seem more complicated than Listing, it’s not actually any harder to create. You will find a series of changes in SAS 9.3 that make the transition easier.

  1. HTML is now the default destination if you are running SAS in the GUI windowing environment. That is, you don’t have to use any ODS statements to get HTML output.
  2. ODS graphics have been separated from SAS/GRAPH and moved entirely to base SAS. ODS graphics are now included in your HTML output with just a base SAS license. If you use the Listing destination, you can still use ODS graphics, but you have to look at the graphics separately, an approach we will all quickly come to see as an unnecessary inconvenience.
  3. There are improvements in the DOCUMENT procedure to make it more compatible with the PRINT procedure, and to allow arbitrary text to be added to a document. These changes are especially useful if you are using the DOCUMENT procedure for formatted output such as HTML output.
  4. There was a sense that ODS style attributes were converging with CSS attributes when SAS 9 was released, and that continues with several new ODS style attributes in SAS 9.3. The new attributes such as WHITESPACE, PADDING, and BORDERCOLLAPSE provide ODS support for attributes that you would take for granted in CSS formatting of an HTML document.
  5. There is a new style, HTMLBlue, that enhances readability and makes more efficient use of space when displaying tables. It also, in my opinion, just looks more glamorous in a graphic design sense. The HTMLBlue style is the default style for the HTML destination in SAS 9.3.

There are two tricky points with HTML being the default destination. First, if you were already running a program in SAS 9.2, you may want to keep its output style the same as before. There are three new system options that, used together, can provide the SAS 9.2 behavior, so that you don’t have to rewrite your programs to keep them from changing.

Second, while HTML is the default destination in the GUI windowing environment, Listing remains the default in all other environments. This means the output format could change just because you run a program in a different way. If this is a problem, add ODS statements to the program to explicitly select the destination you want and close the destination you don’t want. This is not as big a change as it might sound. You can start a program with this statement to close all ODS destinations, without having to know which ones are open:

ods _all_ close;

Follow this with an ODS HTML, ODS LISTING, or other ODS statement to open the destination of your choice. This is all it takes to get consistent ODS output from a program, not affected by where you run it.
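Put together, the beginning and end of such a program might look like this sketch. The output file name is hypothetical.

```sas
/* Close whatever destinations happen to be open */
ods _all_ close;

/* Open only the destination this program is meant to use */
ods html file='report.html' style=HTMLBlue;

proc print data=sashelp.class;
run;

/* Close the destination so the file is complete */
ods html close;
```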

If you have older programs that use the PRINTTO procedure to select a destination file for output, you need to know that the PRINTTO procedure works only for the Listing destination, and not for any other ODS destination. Even for the Listing destination, it is simpler to select destination files using the ODS LISTING statement. However, continue to use the PRINTTO procedure if you use it to temporarily reroute the SAS log, or use it with the Listing destination to combine the log and output in the same file.

Saturday, July 16, 2011

Breaking Up the Dictionary

The last time we saw it, SAS Language Reference: Dictionary was 2,225 pages. Two of its chapters accounted for half the length of the book: Chapter 4, “Functions and CALL Routines,” provided 920 pages, and Chapter 7, “SAS System Options,” 254 more. If the SAS documentation team had simply expanded this book for SAS 9.3, it would have been nearly 3,000 pages in length — perhaps a little too much for a single book.

That explains the need for seven of the new books introduced for SAS 9.3. The two chapters I mentioned have turned into SAS Functions and CALL Routines: Reference (1,027 pages) and SAS System Options: Reference (333 pages). Other volumes cover statements, data set options, and formats and informats. Part II of the Dictionary has been replaced with a separate volume on component objects. Appendix 1, “DATA Step Debugger,” has become the core of a new book, Base SAS Utilities: Reference.

This is the most prominent change in the shape of the documentation, but there are others, as specific features are moved from formal to topical locations. For example, the OPTIONS, OPTLOAD, and OPTSAVE procedures, which have to do with system options, are now covered in SAS System Options: Reference. (The same chapters are repeated in Base SAS Procedures Guide, but just, I imagine, one last time, to give readers a chance to get used to looking for procedures in their new locations. Base SAS Procedures Guide weighs in at 1,794 pages, while some 3,500 pages of documentation for other base SAS procedures are found in other books.)

Thursday, July 14, 2011

Names, and Moving Away From Display Manager

The rules for SAS names get a little more expansive in SAS 9.3. We have long been able to use arbitrary characters and accented letters in variable names and in the names of some SQL tables. The characters of Asian scripts are also available with the right character encoding. Now the same expanded character set applies to the names of SAS data sets and item stores — but with a few cautions.

Names that aren’t traditional SAS names have to be written as name literals: a quoted string marked with the letter N for name. System options can permit or prohibit the extended use of characters in names. Use VALIDVARNAME=ANY to allow additional characters in variable names, or VALIDVARNAME=V7 to prohibit characters other than ASCII letters, digits, and underscore. In SAS 9.3, the system option VALIDMEMNAME=EXTEND allows special characters in member names. Use VALIDMEMNAME=COMPATIBLE to limit yourself to member names that will work in SAS 9.2 and older versions.

The use of special characters is limited to SAS data sets (including views) and item stores. Catalogs and other member types are still limited to traditional SAS names.

These punctuation characters cannot be used in SAS file names:

/ \ * ? " < > | : -

The null character is also not permitted.

There are restrictions on the use of these characters (so don’t use them if you don’t have to):

% & . # $

Spaces at the beginning or end of a name literal are not included in the member name. Also, like all SAS names, member names are case-insensitive, even when you write them as name literals.

A member name can’t be more than 32 bytes long. Depending on the character encoding, you could be limited to as few as 8 characters.
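Here is a minimal sketch of the options and name literals together. The data set name and variable name are invented for the example.

```sas
/* Allow extended characters in member names and variable names */
options validmemname=extend validvarname=any;

/* Name literals: quoted strings followed by the letter N */
data work.'Sales 2011'n;
   'Unit Price'n = 20;
run;

proc print data=work.'Sales 2011'n;
run;
```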

Here is the big limitation on extended member names, though: they can’t necessarily be used in the interactive environment. From SAS 9.3 System Options: Reference:

The windowing environment supports the extended rules in the Editor, Log, and Output windows when VALIDMEMNAME=EXTEND is set. In most SAS windows, these extended rules are not supported. For example, these rules are not supported in SAS Explorer, the VIEWTABLE window, and windows that you open using the Solutions menu.

The list of incompatibilities attached to the legacy display manager and full-screen windows of the SAS interactive environment is growing, not shrinking, with the new release. Given the issues with names and character encodings, along with other concerns, SAS is marginalizing its interactive environment. I don’t think that means SAS wants us all to switch to batch mode, though that will be necessary for many applications in SAS 9.3. I believe SAS remains committed to interactive users, so it may have an eye toward rebuilding its entire user interface or replacing it with something more contemporary and perhaps more flexible and modular in a future release.

Wednesday, July 13, 2011

New SAS Log and Logging Features in SAS 9.3

First things first — what’s new in the SAS log? The log in SAS 9.3 is basically the same as what we have seen since SAS 7, but there is one new feature of note.

The RESETLINE statement, a global statement, resets the program line numbers that appear with program lines in the log. The RESETLINE statement does not affect the line number of the program line it appears on, but the line numbers start over at 1 beginning with the next program line. You might use the RESETLINE statement at the beginning of a SAS program so that the program lines in the log correspond to specific places in the program. Another place where you might use the RESETLINE statement is at the end of an autoexec file.
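In its simplest use, the statement stands alone, as in this sketch of the end of a hypothetical autoexec file followed by the start of a program.

```sas
/* Last statement of a hypothetical autoexec file */
resetline;
/* Log line numbering starts over at 1 on the line after RESETLINE */
data _null_;
   put 'Line numbers in the log restart after RESETLINE.';
run;
```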

Separately, there are extensive changes in the logging facility. It was introduced in SAS 9.2 but comes into its own in SAS 9.3. The logging facility is a cleaner way to generate log messages and deliver them, not particularly to the SAS log, but to external destinations that now may include “third-party databases and Java classes.” A log event indicates a logger (category), along with a log message (text) and a message level (or threshold), such as info or error. You can use functions, component objects, or macro language to create log events. Filters in the logging facility determine which messages go to which destinations.

SAS programs have always been able to write text in the log as a message. The idea here, though, is to have an object-oriented approach to messages so that selected messages can be delivered in particular ways. This could include adding messages to web pages along with the program’s results. Apparently it could also include triggering system-level scripts that respond to specific events or problems that may occur in the execution of a SAS program. As an example, a logging event could result in a SAS program being rerun later if the program fails because of a certain kind of database error. It’s easy to imagine a program for which you would want this to happen automatically, instead of having a person note the failure, read the SAS log, identify the point of failure, and manually relaunch the SAS program.

Tuesday, July 12, 2011

SAS 9.3 Is Available

SAS announced the availability of the newest SAS release, SAS 9.3, early today. Right out of the box, SAS 9.3 shows substantial changes that extend SAS’s reach, but that in some ways represent a break from the past. SAS users generally will need to be aware of the major new features even if they are not using them immediately, in order to navigate the changes successfully.

This blog will consider new features in SAS 9.3 as I come upon them. At the same time, I will be revising my books Professional SAS Programmer’s Pocket Reference, Routine SAS, Professional SAS Programming Shortcuts, and Professional SAS Programming Secrets, along with the Global Statements web site, to bring them up to date with the latest SAS features. I will be writing about my progress and difficulties in that work and asking for suggestions and priorities for each book. Given its topic, this blog is not meant to be open-ended. If all goes well, it may have run its course in a few months.

The important disclaimers: I am not affiliated with SAS and have no inside knowledge, so everything here is based on published information. SAS and other SAS product names are registered trademarks of SAS Institute Inc. in the United States and other countries. If you are interested in licensing SAS, please contact SAS for further information. We are all taking our first look at SAS 9.3 at this point, so some of my ideas here may be untested and speculative, and I may change my mind as I go along. If something I say doesn’t seem quite right, please take the time to comment. Thanks for reading!