If you are like me, you accumulate electronic documents. I am constantly wrangling technical references, E-books, notes, etc.

The problem

The problem I found with having many great documents is that the more I get, the more time I spend searching for, maintaining, and moving them, and the less time I spend actually reading them.

Finding documents

What is the point of saving a document if you can’t find it when you need it or can access it more quickly through an online search? When one has only a handful of documents it is fine to drop them into a documents directory. It does not, however, take too many files before this method becomes unworkable without a robust search tool. While most operating systems provide file search tools, they treat all the text of a document equally. This means that if I am looking for a document by some word, I will be presented with all the documents containing that word, or all the documents with that word in the file name.

Spring Example
If I am searching for a spring planting guide, I do not want the search to return documents about the Spring [1] Java [2] framework.

The next logical improvement over a single documents directory is to break the documents into a categorized directory tree. This pre-sort improves the precision of a text search because one can target a subset of documents by searching a particular directory rather than the entire documents directory. Using this method, one would search in the spring folder under the java directory when the goal is to find a document about the Spring Java framework.

    docs        
      └─technology      
           └─java       
               └─spring       
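As a rough illustration of targeting a search at one branch of the tree, here is a minimal Python sketch. The function name `search_tree` and the restriction to `.txt` files are my own assumptions for the example; a real search tool would also need parsers for PDFs and e-books.

```python
from pathlib import Path


def search_tree(root, word):
    """Return paths of text files under root whose contents mention word."""
    needle = word.lower()
    matches = []
    for path in sorted(Path(root).rglob("*.txt")):
        # Only plain-text files are read here; PDFs and e-books would need a parser.
        text = path.read_text(encoding="utf-8", errors="ignore")
        if needle in text.lower():
            matches.append(path)
    return matches
```

Calling `search_tree("docs/technology/java/spring", "spring")` only looks under the spring folder, while `search_tree("docs", "spring")` would also return the planting guide if it lived elsewhere in the tree.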

While the structure is useful, it becomes more time-consuming to maintain as it grows.

Maintenance

Using a directory tree to organize documents is a maintenance burden. The directory tree changes and grows more complicated as documents are added. Some changes require that documents be found and moved. This means more time maintaining documents and less time reading them.

Sharing

Sharing documents means making documents available to the device where and when they are needed. Before I found my solution I would share documents two ways.

  • Across multiple computers by using an encrypted cloud-based service called Wuala. I have an explanation of Wuala here. It can be used in various ways, but in its simplest form it can be thought of as a removable cloud disk. This works well for a small set of documents, such as a collection of technical references, but not for an entire collection.

  • A local web server. I have tried many different web servers, including one I wrote myself. Every web server I used required installation and maintenance as well as time spent maintaining a web-friendly directory. Time spent on web server maintenance is time that could be spent reading.

The Calibre solution

The solution I like is a free [3] tool designed for electronic document library management, called Calibre. It boasts the following categories of features [4]. Two of them addressed my problem; the others are just gravy!

Library Management

Search with Calibre is powerful and precise because it understands meta-data. Document meta-data exist to describe a document for more meaningful categorization and searching.

Calibre has rich library management, much of which is beyond the scope of the solution and will be a pleasant discovery when you begin using Calibre.

Catalog

A collection of books in Calibre is a catalog. A catalog can be a library or an e-reader device like a Kindle, NOOK, flash drive, etc.

Part of the magic of Calibre’s library management is that one can make multiple libraries. I have a library for all my Mises books, one for technical references, one for fiction, etc. Calibre can make nice CSV or e-book card catalogs.

Document Meta-Data

The next layer of organization is document meta-data. This is simply information that describes the document, such as the title, date, author, publisher, rating, and tags. When Calibre adds a document to the library, it reads the meta-data supplied in that document. The meta-data stored in a document can vary by document type; for example, a PDF would have meta-data while a plain text document normally would not. This is how I found Calibre in the first place. I had lost my home-grown solution for reading PDF meta-data and was searching for a replacement. During that search I found Calibre! Calibre will not only read meta-data from many document types, but it will let one edit or add meta-data as well. It can even query the web to find missing book cover images. One powerful feature is the ability to add custom meta-data in the form of tags. I use tags to identify documents in ways that are specifically meaningful to me.

Search can use all of this to capture the document you want to read, create a collection for an e-reader, or make it simple to move books to a different library. The search interface even has sortable columns in the search results.
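To make the difference between a plain text search and a meta-data search concrete, here is a small Python sketch. It is not how Calibre is implemented; the `Document` class, the two search functions, and the sample titles are all hypothetical names invented for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    tags: set = field(default_factory=set)
    text: str = ""


def full_text_search(docs, word):
    """Match the word anywhere in the title or body, with no notion of fields."""
    w = word.lower()
    return [d for d in docs if w in d.title.lower() or w in d.text.lower()]


def tag_search(docs, tag):
    """Match only documents carrying the given tag."""
    t = tag.lower()
    return [d for d in docs if t in {x.lower() for x in d.tags}]


library = [
    Document("Spring Planting Guide", {"gardening"}, "When to plant in spring."),
    Document("Spring Framework Notes", {"java"}, "Spring beans and wiring."),
]

# The plain search returns both documents; the tag search returns only one.
print([d.title for d in full_text_search(library, "spring")])
print([d.title for d in tag_search(library, "gardening")])
```

The word "spring" alone cannot separate the two documents, but a tag that I assigned myself can.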

Partitioning by saved search

Just as a directory tree can simplify search by confining the search to a subset of the documents, a Calibre library can be logically partitioned by saving a search. The saved search is a dynamic, logical, library partition. An example would be to create a favorites partition by saving a search for all the books with a favorite tag, or all the books with a five star rating. One could run the saved search then add search terms to the search field, to search the subset of documents returned by the saved search.
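The idea of a saved search as a dynamic partition can be sketched in a few lines of Python. This is only a model of the concept, under the assumption that a saved search is a stored query that is evaluated again each time it runs; `make_saved_search` and the sample books are hypothetical.

```python
def make_saved_search(predicate):
    """Store a query as a function; it is re-evaluated every time it runs."""
    def run(library, extra=lambda book: True):
        return [b for b in library if predicate(b) and extra(b)]
    return run


library = [
    {"title": "Java Message Service", "format": "pdf", "tags": {"java", "favorite"}},
    {"title": "Garden Notes", "format": "epub", "tags": {"gardening"}},
]

favorites = make_saved_search(lambda b: "favorite" in b["tags"])
print([b["title"] for b in favorites(library)])

# The partition is dynamic: a newly tagged book appears the next time it runs.
library.append({"title": "Java Concurrency Notes", "format": "pdf", "tags": {"java", "favorite"}})
print([b["title"] for b in favorites(library)])

# Extra terms narrow the results of the saved search.
print([b["title"] for b in favorites(library, lambda b: "message" in b["title"].lower())])
```

Nothing is copied or moved when the partition is created, which is why it costs nothing to maintain.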

Example

Create a search for Java documents in PDF format that have the word “java” in the title. The terms would look like this:

java format:pdf title:"java"

The search could be named java PDFs with java in title.


Search of java PDFs with java in title

One could run the search and then refine it to only those documents that also have Message in the title by adding title:"Message" [5] to the search bar.


Add title:Message to search

Content server for online access

Calibre has a built-in web server. I simply turn it on and I have instant access to my collection. This means that there is no separate maintenance burden for the web server. Here is a screenshot.


Calibre content server

E-book conversion

Calibre has a sophisticated mechanism to convert between different formats. This can be used to create an E-book or PDF from your personal notes, or to change an E-book from one format to another for a better reading experience on a particular device. You may even choose to have the same document in several formats, each targeted to a different device.
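Conversion can also be scripted. Calibre ships a command-line tool named ebook-convert that takes an input file and an output file and infers the formats from the file extensions. The Python wrapper below is a minimal sketch, assuming Calibre's command-line tools are installed and on the PATH; the function names and the file names in the usage note are hypothetical.

```python
import shutil
import subprocess


def build_convert_command(source, target):
    """Build the argument list for Calibre's ebook-convert tool.

    ebook-convert infers the input and output formats from the file extensions.
    """
    return ["ebook-convert", source, target]


def convert(source, target):
    """Run the conversion, failing clearly if the tool is not installed."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed and on PATH?")
    subprocess.run(build_convert_command(source, target), check=True)
```

For example, `convert("notes.epub", "notes.mobi")` would produce a second copy of the same document for a different device.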

Syncing to E-book reader devices

One can use Calibre to copy documents onto an E-reader, removable disk, etc. This means that your documents can be used where you want them, not simply well organized in Calibre.

Downloading news from the web and converting it into E-book form

Various periodicals and web sites can be consumed by Calibre and converted to an E-book for reading on an E-book reader. I like this feature and find that I prefer these documents to live in their own library. This can be done using the Calibre interface or from the command line. My recommendation is to use the graphic interface for download on demand and the command line for automated use. Please let me know if you would like me to explain how to set up automated news conversion for a particular library.

Comprehensive E-book viewer

There are times when the device where a document is needed and the device where Calibre lives are the same device. If you are there anyway, why not read your E-books from Calibre’s E-book viewer?

Free E-books

Providing a list of some of the best sources for your electronic library is worth its own article, but until then here are a few free E-books to get you started.


Good luck and happy reading.


  1. SpringSource.org. Available at: http://www.springsource.org/ [Accessed February 8, 2013].
  2. Oracle Technology Network for Java Developers. Available at: http://www.oracle.com/technetwork/java/index.html [Accessed February 8, 2013].
  3. Free, but also supported by user donations.
  4. calibre - About. Available at: http://calibre-ebook.com/about [Accessed February 28, 2013].
  5. In the screenshot, I highlighted the addition in green.