Tuesday 25 November 2014

Syntax strings in different modes

Earlier I wrote a post about syntax strings versus literal strings, which involved me diving into the Uniface manuals to check my facts.  I talked about using $syntax to convert a literal string into a syntax string - very useful!  

However, what I didn't know before was that you can have different modes.  This is something that appears to have been added in Uniface 9.4.01, without me noticing.

There's a lovely table in the Uniface manuals to describe them, but I've tested their examples and I think there are some inaccuracies.  So I'm going to try and lay them out for you here...

Classic

This is the default behaviour, but is also represented by '%[X]'.  In this case, the usual pattern matching rules apply, with syntax codes being used as wildcards to represent patterns of characters.

  • Proc code: $syntax("D&G")
  • Syntax string: '%[X]%D&G'
  • Matches: "DOG", "DIG", "DUG", etc.

You can see the syntax string starts with the mode defined in this case, which is optional for classic mode, because this is the default.

Case Sensitive

In this mode, all characters will only match with characters of the same case.  Also, syntax codes will be treated as their literal characters, and not as wildcards.

  • Proc code: $syntax("D&G","CS")
  • Resulting syntax string: '%[CS]D%&G%[X]'
  • Matches: "D&G"

You can see the syntax string starts with the mode defined in this case, but it also ends with the exit mode, switching back to the default/classic mode.

The manuals suggest that in the proc code the mode is passed in without the double quotes, but I found this had to be a string for it to work.  They also suggest that the mode "S" can be used instead of "CS" for the same thing - I found that only "CS" works in proc code, but you can use '%[CS]' or '%[S]' in syntax strings you write yourself.


Case Insensitive

In this mode, all characters will match with characters of either upper or lower case.  Again, syntax codes will be treated as their literal characters, and not as wildcards.

  • Proc code: $syntax("D&G","CI")
  • Resulting syntax string: '%[CI]D%&G%[X]'
  • Matches: "D&G", "d&g", "D&g" and "d&G"

Again, I had to use a string to define the mode in the proc code, and only "CI" worked, but '%[CI]' and '%[I]' both worked in syntax strings I wrote myself.

NLS Locale

In this mode, all characters will match with characters of either upper or lower case, depending on the National Language Support (NLS) locale that the system is in (can be checked or set using $nlslocale).  Again, syntax codes will be treated as their literal characters, and not as wildcards.

  • Proc code: $syntax("i#B","NLS")
  • Resulting syntax string: '%[NLS]i%#B%[X]'
  • Matches: If your local is Turkish (tr_TR) then "i#B" and "İ#b", but not "I#B"

Once more, I had to use a string to define the mode in the proc code, and only "NLS" worked, but '%[NLS]' and '%[N]' both worked in syntax strings I wrote myself.

Mixing modes

You can also define a combination of modes in a single pattern, something like this...
  • Proc code: $syntax("%[CI]D%&%[CS]G%[X]")
  • Resulting syntax string: '%[CI]D%&%[CS]G%[X]'
  • Matches: "D&G" and "d&G"

There's a slight typo in the manuals with the proc code here, as they have missed the "%" from the beginning!  

Summary: You can use different modes in syntax strings, either specifying the case-sensitivity or relying on the NLS locale, but be careful if you're using $syntax because you'll lose the wildcards.

No comments:

Post a Comment