Archive for February, 2009

Comment by jqs on MySQL large text comparison performance… best practices?

Feb 10 2009 Published by jqs under stack overflow

Nope, full compares against the column, never a word part. I'll look at Sphinx and Lucene, thanks.

jqs: Oh I get this: http://www.gocomics.com/foxtrot/2009/02/08/ do you?

Feb 08 2009 Published by jqs under twitter

What is an easy way to tell if a list of words are anagrams of each other?

Feb 06 2009 Published by jqs under stack overflow

I was asked this one when I applied for my current job.
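
For reference, one common approach (not necessarily the one that came up in the interview) is to sort each word's letters and check that every word produces the same result. A minimal Python sketch; the function name `are_anagrams` and the case-insensitive comparison are assumptions made for illustration:

```python
def are_anagrams(words):
    """Return True if every word in `words` is an anagram of the others."""
    # Sorting a word's letters gives a canonical signature;
    # words that are anagrams of each other share the same signature.
    signatures = {"".join(sorted(word.lower())) for word in words}
    # Zero or one distinct signature means no word disagrees with the rest.
    return len(signatures) <= 1
```

Sorting costs O(k log k) per word of length k. Comparing letter counts (for example with `collections.Counter`) would avoid the sort, at the price of slightly more code.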

MySQL large text comparison performance… best practices?

Feb 04 2009 Published by jqs under stack overflow

I've got a largish (~1.5M records) table that holds text strings of varying length, which I query looking for matches:

CREATE TABLE IF NOT EXISTS `shingles` (
  `id` bigint(20) NOT NULL auto_increment,
  `TS` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
  `shingle` varchar(255) NOT NULL,
  `count` int(11) NOT NULL default '0',
  PRIMARY KEY  (`id`),
  KEY `shingle` (`shingle`,`TS`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=1571668;

My problem is that while I'm doing comparisons against this table, I am constantly adding and removing data from it, so maintaining the indexes is hard.

I'm looking for best practices for managing the inserts in a timely fashion while maximizing the throughput for the selects. This process runs 24 hours a day and needs to be as quick as possible.

Any help is appreciated.

Update: To clarify, I'm doing one to one matches on the 'shingle' column, not text searches within it.
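
To make the kind of lookup concrete, here is a rough sketch of a one-to-one match in Python. It uses an in-memory SQLite database purely as a stand-in for MySQL, with a trimmed-down version of the table above; the index name `shingle_idx`, the helper `find_shingle`, and the sample rows are all hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL here; the schema is a simplified,
# hypothetical version of the `shingles` table above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE shingles (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        shingle VARCHAR(255) NOT NULL,
        "count" INTEGER NOT NULL DEFAULT 0
    )
    """
)
conn.execute("CREATE INDEX shingle_idx ON shingles (shingle)")
conn.executemany(
    'INSERT INTO shingles (shingle, "count") VALUES (?, ?)',
    [("the quick brown fox", 3), ("quick brown fox jumps", 1)],
)


def find_shingle(conn, text):
    """Return (id, count) rows whose shingle equals `text` exactly."""
    # Equality on the whole column value, so an ordinary index can serve it;
    # no LIKE '%...%' pattern or full-text search is involved.
    cur = conn.execute(
        'SELECT id, "count" FROM shingles WHERE shingle = ?', (text,)
    )
    return cur.fetchall()
```

Calling `find_shingle(conn, "quick brown")` returns an empty list, because a partial phrase is not a match; only the full string stored in the column is.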
