Some Caveats

Our effort to understand Victorian books through a series of graphs about publication titles may lead new visitors to the site to mistakenly believe that we are uncritical quantifiers. But we are professional historians who treasure nuance and sophisticated interpretation, so we want to be explicit about our concerns regarding the limitations of our data and methodologies.

First, we are well aware that the meanings of words change over time, as does word choice. “Science,” for instance, starts the long nineteenth century as an expansive term not so far from “knowledge,” but ends the era with a narrower focus on the natural sciences. “Evil” might be a theme of Victorian thought but not necessarily the term authors most frequently used when they discussed the subject.

Second, different ways of viewing our raw data will present different biases. The first study we ran was solely about word use in titles; undoubtedly we’ll get different charts when we run the same terms against the full texts of the same books. Furthermore, we plan to acquire the context around words (what Google calls “snippets”), which will allow for a much more fine-grained analysis, helping to determine whether a word was used in a positive or negative sense. We also recognize that artifacts of publishing shape the charts, and therefore limit the extent to which they can be representative of larger cultural interests.

Finally, we want to emphasize that the methods explored here will complement, rather than replace, close reading. They do not pretend to be a substitute for understanding the nuances of meaning that humanists enjoy. But they do present a new way of looking at the data and suggest some new questions. Indeed, we have already made some unexpected connections between the bird’s-eye view of title data, texts of specific books, and the historical context from which they came.

We hope you will take a similarly cautious view of the contents of this site, but also share our enthusiasm for its promise. And of course we welcome and encourage your comments and criticisms.


7 Responses to Some Caveats

  1. Kevin Schlottmann says:

    Thanks for posting the data. As always, very interesting. There is one seeming anomaly that jumps out on first reading. If I am looking at the spreadsheet correctly, the database contains 7287 editions published in 1799, followed by a spike to 14753 in 1800 and a drop to 4843 in 1801. A spike also occurs in 1900, although it is relatively not as extreme, and similar but smaller increases are found for every decade.

    Is it possible that when presented with insufficient bibliographic information on publication year, the database defaults to a known value? For example, if the year were only known to be 184X, the edition would be assigned to 1840, and for 18XX, 1800.

    I would be curious what you make of this; there are probably many other ways to explain these spikes, and my apologies if I missed the discussion of this elsewhere.
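One rough way to check for this kind of default-year artifact is to compare each round-numbered year against its immediate neighbors. The following is a minimal sketch in Python using only the three edition counts quoted above; the function names and the 1.5 threshold are illustrative assumptions, not part of the project's own tooling.

```python
from statistics import mean


def spike_ratio(counts, year):
    """Ratio of a year's edition count to the mean of its two neighbors."""
    neighbors = [counts[year - 1], counts[year + 1]]
    return counts[year] / mean(neighbors)


def suspicious_round_years(counts, threshold=1.5):
    """Years ending in 0 whose count exceeds threshold x the neighbor mean.

    The threshold is an arbitrary, hypothetical cutoff chosen for illustration.
    """
    return [
        year
        for year in sorted(counts)
        if year % 10 == 0
        and (year - 1) in counts
        and (year + 1) in counts
        and spike_ratio(counts, year) > threshold
    ]


# Edition counts quoted in the comment above.
counts = {1799: 7287, 1800: 14753, 1801: 4843}

print(round(spike_ratio(counts, 1800), 2))
print(suspicious_round_years(counts))
```

With these three values, 1800 holds well over twice the average of the years on either side. A ratio like this cannot by itself distinguish a metadata default from a genuine publishing surge, but a pattern that recurs at every decade and century boundary would point toward the former.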

  2. dan says:

    @Kevin: Yes, that’s basically correct. A lot of books in Google’s database default to 1900, for instance, because of poor metadata acquired from library and library database partners.

    • Kevin Schlottmann says:

      That seems to raise a whole series of questions. Who are these libraries and database partners providing shoddy metadata? Are they providing other bad information to Google? Is Google working with these partners to improve the metadata? Where does the title information analyzed here come from, Google’s scans or the library partners’ records? Does Google stand behind the accuracy of the metadata it creates itself, such as the full-text? Should all of this bring into question the conclusions drawn from this data set?

      I certainly don’t mean to be a gadfly about this; this sort of study brings the power of computing to the humanities. But when one lives by the data, it seems these sorts of questions should be addressed as an integral part of the discussion.

  3. Russ says:

    This is a fantastic new resource. I wonder why there is such a dip for ‘modern’ at the end of the 18th century?

  4. Pingback: Anterotesis » Victorian Books: The Frequency of Revolution
