Attending #phptek – Which Talks…

Apr 25 2009 Published by jqs under Blog

(See what I did there? When this tweets, the hashtag will be picked up!)

So in less than a month I’ll be attending php|tek, and it’s high time I decided which talks I want to be seen at. Erm, I mean, which I want to attend…

Tutorial Day:

I’m torn between MVC Development in PHP and Web Application Security Boot Camp, but in the end I think MVC will win out…

PHP Code Review wins in the afternoon. And that evening I’ll be busy attending the ChiSox/Twins game.

Day 1:

Highly Scalable Web Applications

Streaming XML

MySQL Server Performance Tuning

SPL to the Rescue

Getting it Done

Security Centered Design

Day 2:

Exceptional PHP

Desktop RIAs with PHP, HTML and JS in AIR

Seven Steps to Better OOP Code

PHP Database Application Architecture for Scalability and Availability

Bend SQL to Your Will With EXPLAIN

Taking it All Offline with SQL Anywhere

Day 3:

Out with Regex, In with Tokens

Working with Microformats

It looks like I’m going to have a lot of fun and learn a lot. I’m also hoping my peers will validate some of my methods.

Kudos again to my work for sending me on this trip, and to my wife for allowing it!


MySQL large text comparison performance… best practices?

Feb 04 2009 Published by jqs under stack overflow

I've got a largish (~1.5M records) table that holds text strings of varying length, which I query looking for matches:

CREATE TABLE IF NOT EXISTS `shingles` (
  `id` bigint(20) NOT NULL auto_increment,
  `TS` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
  `shingle` varchar(255) NOT NULL,
  `count` int(11) NOT NULL default '0',
  PRIMARY KEY  (`id`),
  KEY `shingle` (`shingle`,`TS`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=1571668;

My problem is that while I'm doing comparisons against this table, I am constantly adding and removing data from it, so maintaining indexes is hard.

I'm looking for best practices for handling the inserts in a timely fashion while maximizing throughput for the selects. This process runs 24 hours a day and needs to be as quick as possible.

Any help is appreciated.

Update: To clarify, I'm doing one to one matches on the 'shingle' column, not text searches within it.
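
For illustration, a one-to-one match of the kind described might look roughly like the sketch below. The literal value and the selected columns are assumptions made for the example, not the actual query. Because `shingle` is the leftmost column of the `shingle` (`shingle`,`TS`) key in the table definition, an equality test on it can be served by that index, whereas a substring search with a leading wildcard could not.

```sql
-- Hypothetical exact-match lookup on the `shingle` column.
-- 'example shingle text' is a placeholder value, not real data.
SELECT `id`, `count`
FROM `shingles`
WHERE `shingle` = 'example shingle text';

-- EXPLAIN reports whether the `shingle` key is chosen for the lookup.
EXPLAIN
SELECT `id`, `count`
FROM `shingles`
WHERE `shingle` = 'example shingle text';
```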
