How is searching by a text key more efficient than a wildcard/regex search?
Isn’t comparing a string of characters like comparing any other string of characters?
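Checking whether one string of letters is an exact match for another is far cheaper computationally than checking whether one string contains another. An exact comparison can stop at the first character that differs, while a "contains" check may have to try every starting position in the text.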
There are many complex programming tricks you can use, but let's go back to my librarian example for the non-technical people.
Let’s say I ask the librarian to get every book that ever mentions “cat”. They would have to go through every book, carefully examining every page to see if it contained that combination of letters. It would take forever.
RegExes (AKA Regular Expressions), depending on their complexity, can make things much worse. You are not just looking for the word “cat”, but for the word “cat” when it doesn’t come directly after a space, and when the word right after the following space isn’t “pics”, and … Every bit of additional context and every extra condition adds more CPU time for each record processed.
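To make the librarian's workload concrete, here is a minimal Python sketch. The sample records and the pattern are my own assumptions for illustration (the pattern is a simplified cousin of the condition described above, not an exact translation). The point is that both searches have to read every record, and the regex does extra checking on top.

```python
import re

# Hypothetical sample records, made up for this illustration.
records = [
    "cat pics are popular",
    "my cat sleeps all day",
    "concatenate two strings",
    "dogs only",
]

# Plain "contains" search: every record is scanned for the letters c-a-t.
substring_hits = [r for r in records if "cat" in r]

# Regex with extra conditions (assumed for illustration):
# "cat" as a whole word, and not followed by " pics".
pattern = re.compile(r"\bcat\b(?! pics)")
regex_hits = [r for r in records if pattern.search(r)]

print(substring_hits)
print(regex_hits)
```

Notice that the substring search also drags in "concatenate", which is exactly why people reach for a regex, and exactly why each record gets more expensive to check.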
Now, let’s say we just asked the librarian to find every book where the first word of the book was “cat”. That task could be completed dramatically faster, because only one word per book has to be checked, no matter how long the book is.
You are no longer looking for a needle in a haystack. You are looking for the haystack where the first piece of straw you pull matches your search.
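Here is a rough sketch of the same idea in Python. The book data and function names are hypothetical, and a plain dictionary stands in for a key-based store; this is not how any particular tool implements it. The work of reading each book's first word is done once, up front, and after that a search is a single key lookup instead of a read through every page.

```python
from collections import defaultdict

# Hypothetical library, made up for this illustration.
books = {
    "b1": "cat care for beginners",
    "b2": "the cat in the hat",
    "b3": "cat breeds of the world",
    "b4": "dog training basics",
}

def build_first_word_index(books):
    """Map each first word to the IDs of the books that start with it."""
    index = defaultdict(list)
    for book_id, text in books.items():
        words = text.split()
        if words:
            index[words[0]].append(book_id)
    return index

# Built once; reused for every search afterwards.
first_word_index = build_first_word_index(books)

def find_by_first_word(word):
    # Exact key lookup: no book text is scanned at query time.
    return first_word_index.get(word, [])

def find_containing(word):
    # "Contains" search: the full text of every book is scanned.
    return [book_id for book_id, text in books.items() if word in text]

print(find_by_first_word("cat"))
print(find_containing("cat"))
```

The trade-off is that the key lookup only answers the question the index was built for. It will never find "the cat in the hat", because that book does not start with "cat".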
This is one of the reasons tools like Redis can run so fast (when utilized correctly). Don’t get me wrong, my example is grossly oversimplified, but hopefully it hammers the point home.
This post is part of a series I am doing on how to master search at scale and potentially part of an e-book. If you are interested in getting early access, let me know (Comment, DM, Email, etc).