I'm not entirely sure I understand your problem. Let me paraphrase and tell me if I am right: you have taken all the entries in the searchable fields and stuck them in one bigger table with a single field, then you index and search that, so that if the search is "first name = Dana OR last name = Watanabe" it turns into a search for "value = Dana" and a search for "value = Watanabe", which is fast, but then there is no way to search for "first name = Dana AND last name = Watanabe"?
I think something like this will do it (note I am not a SQL guy, but I can use google):
SELECT entry_id FROM SecondTable WHERE search_term IN ( 'Dana', 'Watanabe' ) GROUP BY entry_id HAVING COUNT(*) >= 2
That says 1. make a list of all records where search_term is either Dana or Watanabe, 2. sort those records by entry_id, and 3. only list the records where the number of records with than entry ID is >= 2 (I said ">=" instead of "=" so we wouldn't throw out your entry if you also lived on Dana Street.
select * from directory_table where firstname||lastname||phone||address like '%Dana%' and firstname||lastname||phone||address like '%Watanabe%'
in other words, rather than search each field individually for the search term, you just search a big concatenated string for the search term once. I think that would perform better. You could even create an index on the concatenated fields together, I think.
I could imagine that searches for '%Dana%' might perform considerably worse than searches for 'Dana%' - do you know if that's the case? If I were writing the database implementation it would probably be that way, since you can immediately discard most strings as not being 'Dana%' if they don't have a leading D, but to discard it as not being '%Dana%' I have to search all but the last 4 characters of the string. With an index it might be even worse - although it depends how the index was made. You could make an index that was designed to support searches for '%Something%' by including all cylic permutations of the string.
as far as i can tell the things that kill the use of indexes are: starting with a wildcard using more than one indexes with ORs (ANDing indexes is fine, but the databases we're using seems unhappy with the OR)
I would expect starting with a wildcard to be slow for the reasons I stated above. Really the database should have an option for indices to include cyclic permutations so you can then do searches on it with leading wildcards.
I'm not sure why ORing indices would be slow - it seems like that could be done quickly without having to do anything special to the index. Well, databases are mysterious.
What system are you using that doesn't support nested queries?
Comments 19
Reply
someone searches "dana watanabe"
and i want the results of "first name = Dana AND last name = Watanabe" OR "first name = Watanabe AND last name = Dana"
and i want this in SQL not some homebrewed database like i know you would do. i've already suggested that. :)
Reply
SELECT entry_id
FROM SecondTable
WHERE search_term IN ( 'Dana', 'Watanabe' )
GROUP BY entry_id
HAVING COUNT(*) >= 2
That says 1. make a list of all records where search_term is either Dana or Watanabe, 2. sort those records by entry_id, and 3. only list the records where the number of records with than entry ID is >= 2 (I said ">=" instead of "=" so we wouldn't throw out your entry if you also lived on Dana Street.
Reply
Reply
I would ask them, but they seem busy.
Reply
select *
from directory_table
where
firstname||lastname||phone||address like '%Dana%'
and firstname||lastname||phone||address like '%Watanabe%'
in other words, rather than search each field individually for the search term, you just search a big concatenated string for the search term once. I think that would perform better. You could even create an index on the concatenated fields together, I think.
Reply
I could imagine that searches for '%Dana%' might perform considerably worse than searches for 'Dana%' - do you know if that's the case? If I were writing the database implementation it would probably be that way, since you can immediately discard most strings as not being 'Dana%' if they don't have a leading D, but to discard it as not being '%Dana%' I have to search all but the last 4 characters of the string. With an index it might be even worse - although it depends how the index was made. You could make an index that was designed to support searches for '%Something%' by including all cylic permutations of the string.
Reply
starting with a wildcard
using more than one indexes with ORs (ANDing indexes is fine, but the databases we're using seems unhappy with the OR)
Reply
I'm not sure why ORing indices would be slow - it seems like that could be done quickly without having to do anything special to the index. Well, databases are mysterious.
What system are you using that doesn't support nested queries?
Reply
Leave a comment