VanWeerthuizenian Expressions


flo.gehrke <flo.gehrke@...>
 

It has been discussed several times that NT doesn't handle Boolean expressions (cf. message #21773), and we considered tools like DOS Findstr, Agent Ransack, etc. for such cases. It should also be mentioned that, to some extent, Boolean expressions can be simulated with RegEx.

Some years ago, Wayne VanWeerthuizen described another approach in his 'NoteTab Tutorial Control Structures v003.OTL' (see the 'File' section of this group).

I picked up another topic (#22762) and tried to find a solution for that job based on instructions given by Wayne in his tutorial.

Example: We've got five lines...

tR4hGGUK
2UJeiy9m
WbNDSk9e
Mytrip12
My.tip12

The task is to select lines which fulfill five criteria:

(A) 8 characters
(B) at least 2 uppercase letters
(C) at least 2 lowercase letters
(D) at least one digit
(E) no punctuation characters or spaces

Following Wayne's tutorial, I tried to translate his instructions into the following clip:

^!Set %Lines%=^$GetTextLineCount$
^!Set %Row%=1
:Loop
^!Set %A%=0; %B%=0; %C%=0; %D%=0; %E%=0
^!Jump ^%Row%
; Condition #1: a string of 8 characters
^!If ^$StrPos("\b.{8}\b";"^$GetLine$";R)$>0 ^!Set %A%=1
; Condition #2: at least 2 upper characters in line
^!If ^$StrPos("[[:upper:]][[:alnum:]]*[[:upper:]]";"^$GetLine$";R)$>0 ^!Set %B%=1
; Condition #3: at least 2 lower letters in line
^!If ^$StrPos("[[:lower:]][[:alnum:]]*[[:lower:]]";"^$GetLine$";R)$>0 ^!Set %C%=1
; Condition #4: at least one digit in line
^!If ^$StrPos("[0-9]";"^$GetLine$";R)$>0 ^!Set %D%=1
; Condition #5: No punctuation characters or spaces in line
^!If ^$StrPos("[\x20\pP]";"^$GetLine$";R)$>0 ^!Set %E%=1
; Logical test of all criteria
^!If ^$Calc(MIN(^%A%;MIN(^%B%;MIN(^%C%;MIN(^%D%;1-^%E%)))))$=1 Next Else Skip
; If whole expression is true then append line to %Hits%
^!Set %Hits%=^%Hits%^$GetLine$^P
^!Inc %Row%
^!If ^$GetRow$ < ^%Lines% Loop
; Output hits
^!Info [L]^%Hits%
^!ClearVariables

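This clip does lead to the intended result: it outputs lines #1, #2, and #3 only, which are exactly the ones matching those criteria. Note that %E% is set to 1 when a punctuation character or space IS found, so criterion (E) is fulfilled when %E% is 0. That is, the clip simulates a Boolean query like 'A and B and C and D and not E', reproduced with '^$Calc$': the nested MIN acts as AND, and '1-^%E%' acts as NOT.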
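As a rough cross-check outside NoteTab, the same flag logic can be sketched in Python. This is only an illustration, not a translation that NoteTab would run: the function names `flags` and `accept` are made up for this sketch, the POSIX classes are approximated with ASCII ranges, and `string.punctuation` stands in for `\pP` (the two sets are not identical).

```python
import re
import string

LINES = ["tR4hGGUK", "2UJeiy9m", "WbNDSk9e", "Mytrip12", "My.tip12"]


def flags(line):
    """Return the five 0/1 flags A..E the way the clip sets them."""
    a = int(re.search(r"\b.{8}\b", line) is not None)
    b = int(re.search(r"[A-Z][A-Za-z0-9]*[A-Z]", line) is not None)
    c = int(re.search(r"[a-z][A-Za-z0-9]*[a-z]", line) is not None)
    d = int(re.search(r"[0-9]", line) is not None)
    # E is 1 when a space or punctuation character IS present
    # (assumption: ASCII punctuation only, unlike \pP).
    e = int(any(ch == " " or ch in string.punctuation for ch in line))
    return a, b, c, d, e


def accept(line):
    """A and B and C and D and not E, written with MIN and 1-E."""
    a, b, c, d, e = flags(line)
    return min(a, b, c, d, 1 - e) == 1


hits = [line for line in LINES if accept(line)]
print(hits)
```

With the five sample lines, only the first three are kept, which agrees with the result reported for the clip.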

Of course, this particular task might be solved in easier ways. However, what matters here is this specific approach which - as Wayne explained - could apply to much more complex queries (with AND, OR, XOR, NOT etc). In cases like that, it would be difficult to use RegEx, and compound loops would end up as extremely long "IF, ELSE IF, ELSE IF Chains" (Wayne).
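To make the idea of AND, OR, XOR, and NOT on 0/1 flags concrete, here is a small Python sketch. Only MIN for AND and '1-x' for NOT appear in the clip above; using MAX for OR and the absolute difference for XOR is an assumption of this sketch, and this thread does not confirm which of these functions '^$Calc$' offers. The example `query` is hypothetical.

```python
from itertools import product


def AND(a, b):
    return min(a, b)      # 1 only if both flags are 1


def OR(a, b):
    return max(a, b)      # 1 if at least one flag is 1 (assumed mapping)


def NOT(a):
    return 1 - a          # flips 0 and 1


def XOR(a, b):
    return abs(a - b)     # 1 if the flags differ (assumed mapping)


# Print a truth table: a, b, AND, OR, XOR, NOT a
for a, b in product((0, 1), repeat=2):
    print(a, b, AND(a, b), OR(a, b), XOR(a, b), NOT(a))


def query(a, b, c, d, e):
    """Hypothetical compound query: ((A and B) or (C xor D)) and not E."""
    return AND(OR(AND(a, b), XOR(c, d)), NOT(e))
```

The point of the approach is that one arithmetic expression replaces a chain of nested IF/ELSE tests, however many operators the query combines.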

What I would like to know:

Is anyone (including Wayne himself) still using such expressions, and could you give us more examples? Regrettably, Wayne's tutorial was never finished and has remained a highly instructive fragment. It also lacks complete working examples.

Regards,
Flo


joy8388608
 

I sometimes do things like this but never heard of the tutorial you mentioned. Where would I find it?


I'm not clear what kind of examples you are looking for, but maybe seeing the tutorial would make it clearer.


As to your example, I would write two lines slightly differently, since that fits the way I think much better.


^!Set %Lines%=^$GetTextLineCount$
^!Set %Row%=1
:Loop
^!Set %A%=0; %B%=0; %C%=0; %D%=0; %E%=0
^!Jump ^%Row%
; Condition #1: a string of 8 characters
^!If ^$StrPos("\b.{8}\b";"^$GetLine$";R)$>0 ^!Set %A%=1
; Condition #2: at least 2 upper characters in line
^!If ^$StrPos("[[:upper:]][[:alnum:]]*[[:upper:]]";"^$GetLine$";R)$>0 ^!Set %B%=1
; Condition #3: at least 2 lower letters in line
^!If ^$StrPos("[[:lower:]][[:alnum:]]*[[:lower:]]";"^$GetLine$";R)$>0 ^!Set %C%=1
; Condition #4: at least one digit in line
^!If ^$StrPos("[0-9]";"^$GetLine$";R)$>0 ^!Set %D%=1
; Condition #5: No punctuation characters or spaces in line
; (changed: %E% is now 1 when NO such character is found)
;^!If ^$StrPos("[\x20\pP]";"^$GetLine$";R)$>0 ^!Set %E%=1
^!If ^$StrPos("[\x20\pP]";"^$GetLine$";R)$=0 ^!Set %E%=1
; Logical test of all criteria
^!Prompt ^%A% + ^%B% + ^%C% + ^%D% + ^%E%
; (changed: all five flags must be 1, so their sum must be 5)
;^!If ^$Calc(MIN(^%A%;MIN(^%B%;MIN(^%C%;MIN(^%D%;1-^%E%)))))$=1 Next Else Skip
^!If ^$Calc(^%A% + ^%B% + ^%C% + ^%D% + ^%E%)$ = 5 Next Else Skip
; If whole expression is true then append line to %Hits%
^!Set %Hits%=^%Hits%^$GetLine$^P
^!Inc %Row%
^!If ^$GetRow$ < ^%Lines% Loop
; Output hits
^!Info [L]^%Hits%
^!ClearVariables


Joy




joy8388608
 

Some more thoughts...


I have seen some reasons to do things similar to this, but it's often better to do it another way to save processing time (if that matters any more :) )


The code could have used a single variable, set to zero and incremented each time a test passed, and accepted the line if that variable equals 5. I do like the original way, since it lets you see which tests passed and which failed.


For this example, a line is not accepted when any test fails so it is a waste to continue testing. Instead of setting a flag, a failure for any test could just move on to the next line to be tested.
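A minimal Python sketch of that early-exit idea, for illustration only: the checks are simplified stand-ins for the clip's RegEx tests (for example, `len(s) == 8` instead of `\b.{8}\b`), and the names `CHECKS` and `evaluate` are made up here.

```python
import re
import string

# One function per criterion A..E; each returns True when the line passes.
CHECKS = [
    lambda s: len(s) == 8,
    lambda s: len(re.findall(r"[A-Z]", s)) >= 2,
    lambda s: len(re.findall(r"[a-z]", s)) >= 2,
    lambda s: re.search(r"[0-9]", s) is not None,
    lambda s: not any(ch == " " or ch in string.punctuation for ch in s),
]


def evaluate(line):
    """Return (accepted, tests_run); stop at the first failing test."""
    tests_run = 0
    for check in CHECKS:
        tests_run += 1
        if not check(line):
            return False, tests_run   # no point in testing further
    return True, tests_run


for line in ["tR4hGGUK", "Mytrip12", "My.tip12"]:
    print(line, evaluate(line))
```

The trade-off is the one mentioned above: the early exit saves work, but it no longer tells you which of the later tests would have passed or failed.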


The only example I can think of right now where something kinda sorta like this would be useful is some code I have that acts on different combinations of choices such as Case Sens, RegEx on/off, etc. For something like this, I usually do something (usually in a wizard) like


If case sens off, set Flag1 to 0 else set it to 1.

If regexp off, set Flag2 to 0 else set it to 2.

If don't wantx, set Flag3 to 0 else set it to 4.

etc


The sum then tells what combination you have... 2 means want regexp on with the other two off, 5 means case sens on, wantx but regexp off and so on.
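The flag values 1, 2, and 4 work because each choice occupies its own bit, so every combination gives a different sum. A small Python sketch of encoding and decoding such a sum follows; the constant and function names are hypothetical, and 'want_x' simply mirrors the placeholder 'wantx' above.

```python
CASE_SENS = 1   # Flag1: 0 or 1
REGEXP = 2      # Flag2: 0 or 2
WANT_X = 4      # Flag3: 0 or 4


def encode(case_sens, regexp, want_x):
    """Add up the flag values of all choices that are switched on."""
    return ((CASE_SENS if case_sens else 0)
            + (REGEXP if regexp else 0)
            + (WANT_X if want_x else 0))


def decode(total):
    """Recover the individual choices from the sum via bit tests."""
    return {
        "case_sens": bool(total & CASE_SENS),
        "regexp": bool(total & REGEXP),
        "want_x": bool(total & WANT_X),
    }


print(encode(False, True, False))
print(decode(5))
```

Here a sum of 2 corresponds to regexp on with the other two off, and 5 decodes to case sens on, wantx on, regexp off, as described above.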


Joy








flo.gehrke <flo.gehrke@...>
 

Joy wrote:
"I sometimes do things like this but never heard of the tutorial you mentioned. Where would I find it? I'm not clear what kind of examples you are looking for but maybe seeing the tutorial would make it clearer."

Joy,

It's the 'NoteTab Tutorial Control Structures v003.OTL' (see the 'File' section of this group):
http://tech.groups.yahoo.com/group/ntb-clips/files

The relevant chapter: "Appendixes", "Performing Boolean Calculations".

Note that Wayne always uses a comma like 'MIN(A,B)' where a semicolon is needed!

It would be nice to see some examples of the way you are using solutions like that ;-)

Regards,
Flo


Wayne VanWeerthuizen
 

"Note that Wayne always uses a comma like 'MIN(A,B)' where a semicolon is needed!"

Sorry! I'll fix that in the next revision.

Funny how in that same tutorial I warned that using commas instead of semicolons is a common source of errors. Maybe I meant my own errors in particular, LOL.


Thomas Gruber Yahoo
 

Hi Wayne,
if you're willing to continue/resume working on the tutorial, feel free to contact me as a tester or just for proofreading. Four eyes often see more than two. If this becomes a live project again, I'm happy to contribute to it if I can.
Kind regards
Thomas






Thomas Gruber Yahoo
 

Hi Wayne, all,
OT:
unfortunately, comma/semicolon is a very widespread source of errors. If you have ever worked in a mixed-language environment like I do all the time (German/English) with things like Excel, you'll know what I mean. In formulas, the German version uses a semicolon as the separator between parameters, while the English one uses a comma. Luckily, unlike older versions, Excel now translates this automatically when a file is opened in a different language installation. But when you're writing a formula, you always have to consider which environment you are in.

Thomas
