This is the IMDb data newsletter, published every 6-8 weeks. To unsubscribe, send a message to data-news-unsubscribe@mlists.imdb.com. To subscribe, send a message to data-news-subscribe@mlists.imdb.com. You can also use the signup page at http://www.imdb.com/maillists .

Feedback on these articles or suggestions for new topics are welcome; contact dnews@imdb.com. The most interesting questions will be used in the next issue.

Why this newsletter
-------------------

As our submission volume continues to increase, we see a number of common problems, many of which cause data to be rejected. Unfortunately, due to the sheer volume of submissions, we can't tell everyone when we have to reject data. In addition, from time to time, we change a policy, and we have had no way to let our submitters know. Finally, while we do have a feedback address (additions-help@imdb.com) for problems with the submission process, there has been no good place for feedback on general data policy issues.

We decided that reviving our old newsletter, which had fallen by the wayside in the wake of launching our daily newsletter, was the best approach. In future issues, we plan to include more tutorials, answer some of the best questions submitted by our contributors, and possibly pose some research challenges. For this issue, we've got a lot of news to cover.

In this issue
-------------

- 2001 in review
- Roman numerals in alternate names
- Backlog on the "TGQ" lists
- How to get your goof accepted
- Title display on IMDbPro
- Running times without countries
- Plots and biographies
- Soundtrack submissions

2001 in review
--------------

The year 2001 was again a record year for submissions to the database. We received 6,228,316 lines of data, about a 35% increase from the previous year. Submissions this year are already running 10-20% over last year's weekly average.
We also added 25,000 new movies last year, or about 10% growth; we finished the year with over 297,000 titles and have already added 11,000 more, despite more stringent rules for inclusion. Overall filmography data grew by about 22% last year. Our top two submitters (aside from IMDb staff) each contributed about 100,000 lines of data.

We've been working hard to improve our processing tools to help us keep up with the increase in submissions; we've also added staff both last year and this year.

Our profound thanks to all our contributors; you've helped make us the most comprehensive source of movie information anywhere.

Roman numerals in alternate names
---------------------------------

As experienced contributors are aware, when two people have the same name, we separate their listings by assigning an arbitrary Roman numeral in parentheses to each person. In some cases, we omit the Roman numeral when one person is much more famous than the other (for example, Harrison Ford).

A few months ago, we made a change in the way we handle alternate names. In the past, all names had to be unique, which meant alternate names had to include Roman numerals just like primary names. Now, alternate names never include Roman numerals. This can cause some confusion when an alternate name duplicates a primary name that does not include a Roman numeral (for example, Steve Allen), so you need to be extra careful in those cases. If you are using the local interfaces, you should be sure you are running version 3.17 (released in November 2001), which added partial support for the new policy.

It's worth noting that names are now managed centrally; in the past, each list manager handled names separately, which could cause problems if a name that appeared on two or more lists needed to be split. The alternate names are outside of that central management system.

Backlog on the "TGQ" lists
--------------------------

The trivia, goof and quotes lists (or TGQ as we rather snappily call them round here) were a little neglected towards the end of last year and that has created something of a backlog of new additions. But you might have noticed that we're already working again on the trivia and goof lists, and you'll no doubt be delighted to learn that work on the quote list will begin very soon.

Nothing has been lost during this brief lull, and nothing will be overlooked now that we're working again, but the nature of the lists means that every item has to be read, checked, and edited by a real live Human Being (you remember "people" - we were very popular in the '70s), and it will take some time to clear the backlog completely. It might take quite a while before we get to your submission, but it's not lost and we shan't ignore it.

Just charging headlong at the backlogs and clearing them in the order they had been submitted didn't seem like the best use of resources. We decided that we would work on the backlog by title rather than submission date (i.e., clearing every submission for a title regardless of its age) and that we could provide the best service to the greatest number of our users by focusing on more popular titles (the titles that the greatest number of people look at) first. This doesn't mean that the less frequently-hit titles will be forgotten about, just that it will take a while longer for us to get to them.

If you've submitted anything for any of these lists, please be patient with us and try as hard as you can to avoid the temptation to resubmit. Our additions system is remarkably reliable (if a little baffling at times) and things are seldom lost, so if you've sent it, we've got it.

How to get your goof accepted
-----------------------------

Tim Norris (the TGQ list manager) has this to say on the subject of preparing your goof submissions:

I can't guarantee that your goof will appear on the site even if you follow these 10 helpful hints (the list manager's decision is final, no correspondence will be entered into, please keep your feet off the jump seat, your home is at risk if you do not keep up repayments on a loan secured on it, etc) but you can lessen the chances of your efforts being thrown out if you:

1. Think again. Was it really a goof? Did his jacket really disappear, or did he take it off while you were looking down the back of the sofa for the remote? Double check if you can. Ask yourself if it might be a joke (a lot of supposed goofs are actually jokes). Then try to explain it away somehow. Only submit something as a goof when you're absolutely sure it's a goof.

2. Double check "factual errors". Many of the "facts" we get are not, in fact, facts at all. It's not burdensome, and it can often be quicker to check something and find out that you were wrong than to go through our impenetrable additions interface and submit it as a goof, so you might actually be saving yourself time and effort. Only submit something as a goof when you're absolutely sure it's not right.

3. Don't tell us about differences between the movie and the original book, comic book, radio series, TV show, computer game, magazine, beer mat or bubblegum wrapper. These aren't goofs.

4. Read the existing goofs carefully. Have we already got it listed? Only submit something as a goof when you're absolutely sure we haven't already got it.

5. Use characters' names, not actors' names.

6. Check your spelling (especially characters' names, which you should always use instead of actors' names, by the way).
Don't worry too much about style and grammar (that's what editors get paid for) but the less work I have to do, the more it's going to look like your submission when you see it online.

7. DON'T SHOUT. And! Don't! Litter! Your! Text! With! Exclamation! Marks! (I really don't like them.)

8. Don't go overboard when describing the goof, but do try to give some helpful detail to identify the scene. Not all the detail will be used in the finished version, but the more I've got, the more easily I can understand what you're saying and check your submission. If you think you need to add a time from the DVD version (you really don't have to bother, but some people like to), please let me know which Region version it is - they run at different speeds.

9. Be polite. If I've made a mistake, just tell me and I'll put it right. It's not the end of the world and it can be fixed. It doesn't get done any quicker or any better if you're abusive or snotty, but I do invest a few extra moments in sticking a couple of extra pins into our Rude User voodoo doll. Would your mother approve of your talking to strangers like that? Well then.

10. Relax. This is one of the "fun stuff" lists.

Title display on IMDbPro
------------------------

If you use IMDbPro, you may have noticed that some titles display differently. Since our Pro customer base is primarily located in the USA, we display USA titles whenever we have them. Therefore, it's important that aka titles be marked accurately with the country whenever possible. At some point in the future, we will allow people to choose their desired country.

Running times without countries
-------------------------------

In the past, running times always had a country attached, and there could be several conflicting entries. We've now introduced the concept of a default running time, which corresponds to the running time of the original release in the country of production.
This time is displayed without a country and, importantly, only times differing significantly from the default (owing to censorship, extended versions, etc.) are now accepted. We discovered that small variations in submitted running times are usually attributable to timing errors or to people relying upon third-party sources (e.g., newspapers) that round times to the nearest 5-minute interval.

If a different version of a title has been released in your country with a different running time, wherever possible, please also submit an entry to the alternate versions section explaining the changes.

A reminder to contributors in Europe and other regions of the world with a 25 frames/second video system: TV and video recordings will run approximately 4% faster than their theatrical release. Please do not submit video/TV running times for titles based on manual timings from home viewings.

We will eventually modify the submission process to accept running times without countries specified; in the meantime, use the country of production or first release whenever possible.

Plots and biographies
---------------------

A reminder that all plot summaries and mini-biographies must be your own original work. We have seen a large number of biographies copied from official web sites or obituaries. If you have permission from an official web site, you need to include a comment to that effect with your submission (or better yet, have someone connected with that site write to us). The same holds for plot summaries: if it's not your original work, it needs to be credited properly and we need to know you have permission.

In biographies, please check the "other works" section if appropriate before submitting trivia; items should not appear in both sections, and the "other works" section is preferred when both are possible. Titles of plays and other works should not be submitted in all caps.

Soundtrack submissions
----------------------

The soundtrack section includes only information on soundtracks of the movie itself, not soundtrack albums. This is for several reasons. Often a movie soundtrack is not released separately. The music released on an LP/CD often does not match the music in the movie. Sometimes, different soundtrack albums are released with different music (often including songs only "inspired by" the movie).

The complete soundtrack guidelines can be found at http://www.imdb.com/Guides/soundtracks