NEWSLETTER #9
October 1996
this issue edited by Jon Reeves

Welcome to issue 9 of the IMDb newsletter. The newsletter is intended to keep database users and contributors informed of the latest developments from the management team. Comments and suggestions are welcome and should be directed to newsletter@imdb.com. Issue 10 is scheduled for mid-November.

Apologies for the delay in putting out this issue; vacation schedules and outside activities got in the way (see, we do have lives!). I could also say we were busy celebrating our third anniversary as a web site, except we didn't realize it until after the fact. Yes, IMDb was one of the first 500 web sites (though not at our current location), and it's been around longer than most CD-ROM movie references. If you want to celebrate, October 17 will be the sixth anniversary of the database, originally as a set of UNIX shell scripts.

To subscribe to the newsletter, fill out the survey and check the appropriate box.
Contents
FUZZY SEARCHING
by Michel Hafner

The WWW version of our database access software now offers a fuzzy search for names and titles that goes far beyond the old substring and exact search options (which nevertheless remain useful and the default search types). The CPU-intensive regular expression search is an additional option. The four basic search types are best (not) used in the following cases:

SUBSTRING search is the appropriate type of search if you:
Substring search is the default search type and very useful if you pay attention to picking suitable substrings.

EXACT search is the appropriate type of search if you:
FUZZY search is the appropriate type of search if you:
Fuzzy search is a very useful search method if the above cases apply. But keep in mind that fuzzy search is not a more tolerant substring search! If it generally were, it would produce even bigger output than substring search for common/small substrings, which it doesn't. To separate the good matches from the bad matches (e.g., the many matches substring search will give you anyway) it outputs only names/titles whose length is more or less the same as the length of your search string (there are exceptions to this rule if the matched substring is very likely to be relevant despite its length difference to the search string).
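The length rule described above can be illustrated with a short Python sketch. This is not the database's own fuzzy matching code, whose details are not given here. The function name `fuzzy_search`, the thresholds `min_ratio` and `max_len_diff`, the use of `difflib.SequenceMatcher` as the similarity measure, and the sample titles are all assumptions made for illustration. The sketch also leaves out the exceptions for matches that are likely to be relevant despite a length difference.

```python
from difflib import SequenceMatcher


def fuzzy_search(query, titles, min_ratio=0.6, max_len_diff=2):
    """Return titles similar to `query` whose length is close to the query's.

    Hypothetical sketch: the thresholds and the similarity measure are
    assumptions, not the values used by the real search.
    """
    q = query.lower()
    hits = []
    for title in titles:
        t = title.lower()
        # Length rule: drop candidates much longer or shorter than the
        # query, even if they contain it as a substring.
        if abs(len(t) - len(q)) > max_len_diff:
            continue
        ratio = SequenceMatcher(None, q, t).ratio()
        if ratio >= min_ratio:
            hits.append((ratio, title))
    # Best matches first; ties broken alphabetically.
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in hits]


if __name__ == "__main__":
    # Made-up sample list, for illustration only.
    sample = ["Tim", "Tom", "Time", "Timecop", "Victim of Love"]
    print(fuzzy_search("Tim", sample))
```

A plain substring search for "tim" over the same sample would also return "Timecop" and "Victim of Love"; the length rule is what keeps them out of the fuzzy results.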
So a fuzzy search for title 'Tim' will give you for example

REGULAR EXPRESSION search is the appropriate type of search if you:
If you encounter unexpected or incorrect behaviour of fuzzy search, you can drop me a line and report the problem.

EFFECTS COMPANY LIST LAUNCHED
by Rob Hartill

How did Forrest Gump shake the hand of JFK? It was just an illusion of course, and in this case it was created by the good folks of ILM. You might think that ILM (Industrial Light and Magic) do the effects for all of Hollywood's blockbusters these days; well, not quite. There are many other "Special Effects Companies" out there doing a great job, and to acknowledge their ever-growing contribution to today's movies the IMDb now records their credits. See the extended search form for a good starting place to search or browse this new section. [We've also added Norwegian and Italian title aka lists at the FTP sites.]

UPDATE CYCLE CHANGE
by Col Needham

The database continues to grow at an amazing rate despite the fact that we now have complete information for thousands of movies and people. I'm pleased to say that contributors are finding new areas to research and expand the database (particularly silent movies and non-US releases). However, new data has to be processed and validated, and this takes an increasing amount of time as the database grows. Previously, additions were distributed to the database editors on Friday for processing over the weekend, ready for the site update late on Sunday. This has now been moved to Thursday in order to allow more time and has worked out very well. It means that the best-case processing time moves from 2 days to 3 days (and the worst from 9 to 10 days), but we can still keep on top of everything in weekly cycles. As a side effect of this, the deadline for the template additions interface has been moved from Thursday to Wednesday to allow time for the templates to be processed. Please keep the new information pouring in and help the database to grow.

PLOT SUMMARIES WANTED
by Col Tinto

As ever, we need your summaries!
Starting to scrape the bottom of the 'most voted for' movies list now, but here are another 20 popular movies which as yet don't have summaries.
NEW ADDITIONS GUIDE
by Col Needham

A new version of the complete database additions guide was published at the end of August. A copy is available by sending e-mail to the IMDb mail-server, or from ftp://uiarchive.cso.uiuc.edu/pub/info/imdb/tools/additions-guide.gz or any of the other IMDb ftp sites. There are several changes, but most notably a new policy on uncredited appearances. All uncredited appearances must now be tagged with the attribute (uncredited), whether it is a cameo from a major star in a recent movie or a bit player in an older movie where usually only the principal cast are credited. Use of this attribute will automatically trigger the removal of the cast order number, thus fixing the problem highlighted by Rod Crawford in the previous newsletter.

HOT SEARCHES

Here are the most popular searches people have done lately, based on total pages for the week ending September 28.

Titles:
ID4 continues strong, though without the commanding lead it had last month. Star Wars climbs from number 10. Besides the usual new releases, surprising showings by Naniwa Ereji (aka Osaka Trilogy) and a 1944 Swedish title.

People:
Margaret Colin drops off the top 150 completely (how quickly they forget); Jeff Goldblum plummets to #102. And Will Smith may have helped him save the world, but he's already down to #39. Kevin Costner, Sharon Stone, and Jennifer Connelly just missed the cut this month. Michelle Pfeiffer's latest movie opens in a couple weeks; should raise her standing even more. At least Sean Connery gives hope to us balding males. Number 1 Hatcher was visited about 5 times as often as Arnold. Joan Bud (who?) was #30.

HOT MOVIES
by Col Needham

Movies opening in the US in August/September sorted by number of votes (to September 26th):
Movies opening in the US in August/September sorted by average votes (to September 26th):
IMDb IN THE NEWS
by Jon Reeves

Just a few of the traditional media outlets that have mentioned us lately: The Net (US). Newsday. Yahoo! Internet Life (September *and* October). Boston Globe. Utne Reader. Sight and Sound. KFBK Radio, Sacramento. P.O.V. Magazine. I-way 500 (best Leisure site). WebSight (months ago; we just found out). BBC Radio 1. Library Journal.

We're particularly proud of the review in Yahoo! Internet Life (Sept.), where the two best known US movie reviewers, Roger Ebert and Gene Siskel, both gave us a thumbs up.

We've also won several new awards. See selections from the gallery here. NetBest Awards (finalist). Awesome Universal t@p 500 WebSites. Access to the World Cool Link of the Week. Computer Currents Interactive Link of the Week. Komputer Klinic Kool Site. (WFMM) Cool Site o' the Day. USA Today Hot Site. P.O.V. Top 100 (#53). I-way 500 (best Leisure site). Top Shopping Site: All Internet Shopping Directory.

And a web-related mention of note: the hot100 list shows us as the eighth hottest site on the whole net.

WEB SERVER CHANGES
by Rob Hartill

Since the last newsletter, a lot of coffee has been consumed and in between the trips to the kettle some new code has been added. Elsewhere in this newsletter you can read about Michel Hafner's new fuzzy matching code (written in C, you know! How did he slip that by my Perl-only filter?).

Other changes include much more online checking of web-based submissions, to try to clean more of the data we receive before it even leaves your web browser. All the extra checks and warnings might frustrate at first, but we hope they remind you how best to submit clean data that can be added sooner... all those warnings used to be fixed by us manually :-(

The old style quiz has been put to rest and replaced with a new quiz.
At the moment it comes in two varieties: (1) a name-guessing game based on hangman and (2) a multiple choice quick quiz (in the same style as the old quiz) with questions that will be designed to tease and educate. If you have some devious questions that you'd like to add to the quiz, send them to me please.

The main search form now allows searching of 'business', 'goofs', 'technical' and 'trivia' under the 'word search' section, so now you can search for your favourite type of goof or studio filming locations, etc.

Behind the scenes, our servers became HTTP/1.1 compliant thanks to our developers' version of Apache 1.2, and our Perl became fuel injected thanks to Doug MacEachern's "mod_perl_fast" Apache plug-in module, which embeds a Perl interpreter into our Apache server.

Look out for translated versions of key pages in the very near future. Using Apache's language negotiation feature we'll soon be serving up some pages in French, German and Italian (to begin with). For users with browsers capable of specifying a preferred language (e.g. Netscape 3 for Win/Mac [not for Unix! sigh]) the new pages will magically appear if you prefer a language other than English and if we have a translation available. Everyone else will continue to see English.

XREGAL UPDATES
by Lachlan Wetherall

Since the last newsletter, versions 1.1 and 1.2 of xregal have been released. Xregal is an X11 hypertext interface for the Internet Movie Database when it is installed locally on a Unix host. Apart from a number of bug fixes, the main features added from version 1.0 to 1.2 are:
For a full list of changes, consult the ChangeLog file. Version 1.2 of xregal requires the moviedb3.2g package to be installed first. Both xregal and moviedb3.2g are available from the usual IMDb ftp sites:
ftp://uiarchive.cso.uiuc.edu/pub/info/imdb/tools
The latest development version of xregal is always available from the xregal home page. If you have any suggestions on improvements for xregal, drop me an e-mail. Bug reports are especially welcomed and acted on speedily.

DATABASE STATISTICS
by Jon Reeves

This is a regular section giving information about the current size and growth of the IMDb. We receive between 30,000 and 40,000 additions every week from users all over the world.

Number of filmography entries: 1,194,654
Number of people covered: 337,775
Number of movies covered: 84,196
Size of the database (Mb): 97

Recent milestones:
FUTURE DEVELOPMENTS

This is a regular section listing some enhancements we're currently looking at. Please bear in mind that some of these may take quite a while to come to fruition or even fail to materialize because the original volunteer decides not to proceed.
Academy Awards and Oscar are registered trademarks of the Academy of Motion Picture Arts and Sciences.