Wednesday, May 17, 2017

Known Potential Errors Not Yet Adjusted For

There are 6 quirks of Tanach I have not yet managed to accommodate in my search program: 5 cases of a "piska" falling in the middle of a possuk, and one case where a "sofit" letter appears in the middle of a word instead of at its end.

A few verses in Torah are configured peculiarly: each one actually consists of two verses combined into one. The two parts together make up the whole verse, even though in concept each part could be considered logically distinct from the other. For some reason, this is how it is. (The asterisked verse in the illustration shows such an instance.)

This notation, that “there-should-have-been-a-break-between-verses-here” (between component verses), is called a “piska”. In Torah, the piska physically manifests as spatial breaks in written scripture, either as a {פ} or as a {ס}.

There are 3 piskas in Torah, and 2 more in Yehoshua. Here are their locations:

1) Beraishis 35:22
2) Bamidbar 26:1
3) Devarim 2:8
4) Yehoshua 8:24
5) Yehoshua 4:1

These grammatical quirks result in two component logical segments physically occupying one verse.

When I first designed my database, it never occurred to me two component verses could be "squeezed into" one physical verse.

How would this affect results? Suppose, for example, you seek a gematria where the result will be a whole possuk. If any of the piska component verses also held this value, I'd miss it, because I don't test component verses. I deal only with the whole physical verse.

Ideally, component verses should also be subject to individual examination. But it's too late to patch this up now to accommodate these possible extra contributions to search results. Even now I wouldn't know how to design my database for this without overloading the program with a lot of extra code. It would horribly slow things down to check every verse when only 5 verses in all of Tanach have this quirk.

To this minor degree (and only where these component verses are relevant), my search results may fall short of the full set of possible results.
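One conceivable way to cover the component verses without checking every verse is a small side table that lists only the piska verses. The Python sketch below is an illustration under assumptions, not the program's actual code: `PISKA_SPLITS`, `candidate_values`, the demo reference and its split index are all hypothetical, and the real word positions of the 5 piskas would have to be entered by hand.

```python
# Standard letter values; final ("sofit") forms take the value of the base letter.
GEMATRIA = {
    "א": 1, "ב": 2, "ג": 3, "ד": 4, "ה": 5, "ו": 6, "ז": 7, "ח": 8, "ט": 9,
    "י": 10, "כ": 20, "ל": 30, "מ": 40, "נ": 50, "ס": 60, "ע": 70, "פ": 80,
    "צ": 90, "ק": 100, "ר": 200, "ש": 300, "ת": 400,
    "ך": 20, "ם": 40, "ן": 50, "ף": 80, "ץ": 90,
}


def gematria(text):
    """Sum the letter values; spaces and unknown characters count as 0."""
    return sum(GEMATRIA.get(ch, 0) for ch in text)


# Hypothetical side table: (book, chapter, verse) -> index of the first word
# of the second component. Only the piska verses would be listed here.
# The single entry below is a toy example, not a real location.
PISKA_SPLITS = {("Demo", 1, 1): 1}


def candidate_values(ref, words, splits=PISKA_SPLITS):
    """Return the whole-verse value, plus component values for a piska verse."""
    values = {"whole": gematria("".join(words))}
    split = splits.get(ref)  # a single dictionary lookup per verse
    if split is not None:
        values["first"] = gematria("".join(words[:split]))
        values["second"] = gematria("".join(words[split:]))
    return values


demo = candidate_values(("Demo", 1, 1), ["אב", "גד"])
print(demo)  # {'whole': 10, 'first': 3, 'second': 7}
```

With a layout like this, ordinary verses cost one extra dictionary lookup each, and only the listed verses get the two additional gematria calculations.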

The 6th quirk that might render a wrong result is in Yeshayahu, chapter 9, possuk 6. The first word (לםרבה) is written with a "mem sofit" as its 2nd letter, instead of being written as למרבה.

Rather than reconfigure my entire database as I originally conceived it, and make the code changes that would entail, for the meantime I'll just "live with" this source of potential error.
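For the sofit quirk in Yeshayahu 9:6, one possible approach is to fold every final letter to its base form before two words are compared, so that the position of a sofit letter stops mattering. This is only a sketch, assuming words are compared as plain letter strings; `fold_finals` and `same_word` are hypothetical names, not part of the program.

```python
# Map each final ("sofit") letter to its ordinary form.
FINAL_TO_BASE = str.maketrans({
    "ך": "כ",
    "ם": "מ",
    "ן": "נ",
    "ף": "פ",
    "ץ": "צ",
})


def fold_finals(word):
    """Replace sofit letters with their base letters, wherever they occur."""
    return word.translate(FINAL_TO_BASE)


def same_word(a, b):
    """Compare two words while ignoring the sofit/base distinction."""
    return fold_finals(a) == fold_finals(b)


# The spelling in Yeshayahu 9:6 versus the ordinary spelling.
print(same_word("לםרבה", "למרבה"))  # True
```

Since a sofit letter has the same standard gematria value as its base letter, folding would change only letter matching, not the standard gematria totals.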

Wednesday, May 10, 2017

New Method of Multi-Word Search Introduced

Rather than complicate the Find a Word method by asking how many words in a verse are being sought, I introduced the new search method Find a Phrase as a separate multi-word method. Nor do I plan to combine them in the future because each method comes with different options.

Unlike the single-word search, this multi-word search has no "reverse order" or "any order" option.

The only proviso for word entry for this new method is to separate the user's words with a space between them. This proviso is always on display just above the user-entry field.

Finally, there is as yet one important caveat concerning the matched results, and it is explained in the Post Script at the end of this post.

Example 1:
I read in the Rebbe's Maamarim that "יציאת מצרים" is mentioned in Torah 50 times (the same count as the 50 "שערי בינה"). So how would I locate these 50 mentions? This new method provides a way. I can type in two words: the 1st will be letters from the root verb "to go out", namely "צא", and the 2nd will be the name "מצרים". As follows:

This set of options results in 82 hits. Most of the 50 mentions probably show up in this result list, although at least 32 hits will have to be eliminated. Below you see the last 8 hits. Of these, #77 is a false hit and would not count. The others here seem to be valid.

Below, you see the first 8 hits of the above search. The first 5 hits can also be eliminated.

Notice I set the 2nd option to "not consecutive". That's because the verb and its object, the name, needn't be adjacent words. Other words can be wedged in between the verb and the name, as is evident in #8.

But mention of "יציאת מצרים" can also occur where the name precedes the verb. Using only the options and the user's entry shown above would certainly miss this less common word order.

So here is the other thing you can do to catch more of the 50 mentions. Note that the user's two-word entry is reversed from what it was before! This will yield results the above search would miss. Of 7 hits, #6 (see below) probably also qualifies as one of the 50 mentions.
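The behavior described in this example (words entered in order, separated by a space, with a "consecutive" / "not consecutive" choice) can be sketched roughly as follows. This is a minimal illustration, not the program's code: `find_phrase` and `word_matches` are hypothetical names, the `whole_word` switch is an assumed stand-in for the other option, and the three-word list is just a short illustrative fragment.

```python
def word_matches(word, term, whole_word):
    """Exact match, or 'term appears somewhere inside word' when whole_word is False."""
    return word == term if whole_word else term in word


def find_phrase(words, terms, consecutive=False, whole_word=False):
    """Return the word positions of the first in-order match, or None."""
    if not terms:
        return None
    if consecutive:
        # The terms must match adjacent words, so try every starting position.
        for start in range(len(words) - len(terms) + 1):
            if all(word_matches(words[start + k], t, whole_word)
                   for k, t in enumerate(terms)):
                return list(range(start, start + len(terms)))
        return None
    # Not consecutive: other words may sit between the matches.
    positions = []
    i = 0
    for term in terms:
        while i < len(words) and not word_matches(words[i], term, whole_word):
            i += 1
        if i == len(words):
            return None
        positions.append(i)
        i += 1
    return positions


words = "הוצאתיך מארץ מצרים".split()
print(find_phrase(words, "צא מצרים".split()))                    # [0, 2]
print(find_phrase(words, "צא מצרים".split(), consecutive=True))  # None
print(find_phrase(words, "מצרים צא".split()))                    # None
```

The last call shows why the reversed entry is a separate search: matching is in order, so an entry with the name first only finds verses where the name precedes the verb.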


Example 2:
Suppose you want to find out whether, and where, “יעקב יעקב” appears in Tanach.
Just type in both words, choose among 2 options (= 4 scenarios), and hit GO.

The above set of options and user entry yield the single hit:

Now at version 42.

Post Script: This method may miss hits if a possuk contains more than one hit. Only the first hit in that possuk will be reported, while the rest, if they exist, will go unreported. This simplifies the code immensely. Although this method can be tweaked to include all matches, I am leaving the code as is (at least for now) because it is still a powerful search method.
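As an illustration of what "include all matches" could mean, the sketch below reports one match for every word in the possuk that can start the phrase, instead of stopping at the first. It is an assumption-laden sketch (non-consecutive, partial-word matching only), and `find_all_phrases` is a hypothetical name; the word list is a made-up example, not a real possuk.

```python
def find_all_phrases(words, terms):
    """Return one list of positions for each word that can start the phrase."""
    hits = []
    if not terms:
        return hits
    for start, word in enumerate(words):
        if terms[0] not in word:
            continue
        positions = [start]
        i = start + 1
        for term in terms[1:]:
            # Scan forward for the next word containing this term.
            while i < len(words) and term not in words[i]:
                i += 1
            if i == len(words):
                positions = None  # the remaining terms cannot be matched
                break
            positions.append(i)
            i += 1
        if positions is not None:
            hits.append(positions)
    return hits


words = ["יעקב", "ויאמר", "יעקב", "יעקב"]
print(find_all_phrases(words, ["יעקב", "יעקב"]))  # [[0, 2], [2, 3]]
```

A first-hit-only search would report just the first of these two matches, which is the trade-off described above.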