A DjVu Decode Tutorial

This tutorial illustrates how to load/save a DjVu file.

Step 1: Initialization

To use the DjVu SDK, first include the following headers.

#include "djv_document.h" // for Document class
#include "djv_page.h" // for Page class
#include "djv_serialize.h" // for IFF::serialize, IFF::deserialize
using namespace Celartem;
using namespace Celartem::DjVu;

Since the base library is shared with the PixelLive SDK, its classes are in the Celartem namespace, while DjVu-related classes are in the Celartem::DjVu namespace.

Step 2: Creating a Storage Instance

In this step, create a Storage instance from a DjVu file.

AutoPtr<Storage> storage = Storage::create("myfirst.djvu");

The Storage class is the easiest way to load files from a local disk or from an HTTP-based web server. The following code loads an image from a web server:

AutoPtr<Storage> storage = Storage::create("http://sample.celartem.com/images/mrsaito.djvu");

There are also other Storage classes. For more information about them, see Storage Class Factory Functions.

Step 3: Creating a Document Instance

You should create a Document instance from the Storage instance.

AutoPtr<Document> doc = Document::create(storage);

Alternatively, you can use the following shortcut, which does not require you to open a Storage instance separately:

AutoPtr<Document> doc = Document::create("http://sample.celartem.com/images/mrsaito.djvu");
Warning
Without an appropriate license script, the decoding function (the DjVuDecode feature) will stop working 2 weeks after its build time. See License System for more information.

Step 4: Loading a Page

A Document instance usually holds several Page instances, which are managed like an array. The following code loads the first page:

AutoPtr<Page> page = doc->getPages()[0];

You can also get the number of pages in the Document with the following code:

size_t pageCount = doc->getPages().getSize();

Step 5: Loading the Image of the Page

Now you're ready to load the image data from the Page instance. First, though, here is how to get the page size.

// get the "pixel" size of the page
size_t w = page->getWidth();
size_t h = page->getHeight();

The code above gets the "pixel" size of the page, but that is usually too large to show the whole page on a display. So we should calculate a size that fits the display. This tutorial derives the physical size from the page's resolution, expressed in dots per inch (dpi).

// here we assume the display is 96dpi.
size_t dw = w * 96 / page->getDpi();
size_t dh = h * 96 / page->getDpi();
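The integer division above truncates, and it would divide by zero if a page ever reported a resolution of 0. As a minimal sketch of the same conversion with rounding and a guard, the helper below is hypothetical (scaleToDisplay is not part of the SDK) and works on plain integers, so it can be tried without a DjVu file:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not an SDK function: converts a length in page pixels
// to display pixels, given the page resolution and the display resolution.
// It rounds to the nearest pixel instead of truncating, and it never returns
// 0 for a non-empty page.
std::size_t scaleToDisplay(std::size_t pagePixels,
                           std::size_t pageDpi,
                           std::size_t displayDpi)
{
    // A resolution of 0 would make the ratio meaningless.
    assert(pageDpi > 0 && displayDpi > 0);
    if (pagePixels == 0)
        return 0;
    const std::size_t scaled = (pagePixels * displayDpi + pageDpi / 2) / pageDpi;
    return scaled > 0 ? scaled : 1;
}
```

For example, a 2550 x 3300 pixel page at 300 dpi maps to 816 x 1056 pixels on a 96 dpi display.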

OK, now render the pixel image:

size_t stride = dw * 3; // bytes per line; an RGB pixel consumes 3 bytes
size_t byteSize = stride * dh;
// First, allocate the memory that stores the rendered output.
// Since we don't plan to resize the buffer, make the size and the reserved
// size the same; the second parameter is the reserved size.
SimpleArray<u8> buffer(byteSize, byteSize);
page->render(
&buffer[0], stride, pmRGB8,
Rect(0, 0, dw, dh), // left, top, width, height of the portion to render
dw, dh);

The 5th and 6th parameters of the Page::render method specify the size, in pixels, at which the whole page is rendered, and the 4th parameter specifies which portion of that rendered image is actually stored in the buffer. The size of the picture written to the buffer is therefore controlled by the 4th parameter.
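When only a viewport of the rendered page is needed, the portion rectangle and the buffer size have to agree. The following sketch shows one way to clip a requested viewport to the rendered page and to size a tightly packed RGB buffer for it. Region, clipToPage and rgbBufferSize are hypothetical names, not SDK types or functions, and whether Page::render accepts a stride based on the portion width is an assumption that should be checked against the SDK reference.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for the SDK's Rect: left, top, width, height in pixels.
struct Region
{
    std::size_t left, top, width, height;
};

// Clips a requested viewport to a page rendered at renderWidth x renderHeight,
// so that the portion never extends past the rendered image.
Region clipToPage(const Region& r, std::size_t renderWidth, std::size_t renderHeight)
{
    const std::size_t left = std::min(r.left, renderWidth);
    const std::size_t top = std::min(r.top, renderHeight);
    const std::size_t width = std::min(r.width, renderWidth - left);
    const std::size_t height = std::min(r.height, renderHeight - top);
    return Region{left, top, width, height};
}

// Bytes needed for a tightly packed 24-bit RGB buffer that holds the region.
std::size_t rgbBufferSize(const Region& r)
{
    const std::size_t stride = r.width * 3; // 3 bytes per RGB pixel
    return stride * r.height;
}
```

With a page rendered at 816 x 1056, a 400 x 400 viewport requested at (700, 900) is clipped to 116 x 156 pixels, which needs 54288 bytes.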

Step 6: Accessing the Actual Chunks

As the previous steps show, you can easily render the page image without knowing how the actual data are stored in a DjVu file. For advanced operations, however, you may have to access the internal data.
DjVu data is stored in structures called chunks, and each page corresponds to a chunk. So you can get the chunk of a page from its Page instance:

AutoPtr<Chunk> chunk = page->getChunk();

The Document instance is the actual owner of the page chunks, so you can also access chunks from the Document instance:

AutoPtr<Chunk> djvmChunk = doc->getChunk();

The chunk returned by the Document class is called DJVM, and it manages the chunks of a multipage document. Most of its child chunks hold pages, but some serve special purposes. To get these chunks, use code like the following:

Chunk::Array& chunks = djvmChunk->getChildren();
size_t count = chunks.getSize();
for(size_t i = 0; i < count; i++)
{
AutoPtr<Chunk> chunk = chunks[i];
...
}

Please note that not all chunks have child chunks; if you call the Chunk::getChildren method on a chunk that has no children, an exception is thrown. So you should check whether the chunk has children first, as follows:

if(djvmChunk->isCollection())
{
// yeah, DJVM is really a chunk which contains child chunks!
}
else
{
// it seems it does not have any children.
}
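The same check-before-descending rule extends naturally to walking a whole chunk tree. The sketch below illustrates the idea on a hypothetical Node type rather than on the SDK's Chunk class, so it runs without the SDK. Node, makeNode and countChunks are names invented for this example, and the chunk IDs are only sample labels.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a chunk; the SDK's Chunk class is not used here.
struct Node
{
    std::string id;                              // four-character chunk ID, e.g. "DJVM"
    bool collection = false;                     // plays the role of isCollection()
    std::vector<std::shared_ptr<Node>> children; // only consulted for collections
};

// Small factory to keep the example short.
std::shared_ptr<Node> makeNode(const std::string& id, bool collection)
{
    auto node = std::make_shared<Node>();
    node->id = id;
    node->collection = collection;
    return node;
}

// Counts a chunk and all of its descendants. Children are visited only when
// the chunk is a collection, mirroring the check shown above.
std::size_t countChunks(const Node& node)
{
    std::size_t total = 1;
    if (node.collection)
    {
        for (const auto& child : node.children)
            total += countChunks(*child);
    }
    return total;
}
```

With the real SDK, the equivalent loop would call isCollection before getChildren on every chunk it visits.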

Step 7: Chunk Serialization/Deserialization

As described above, a DjVu file is essentially a collection of chunks. The chunks are usually loaded from a DjVu file with the IFF::deserialize method, as follows:

AutoPtr<Storage> storage = Storage::create("myfirst.djvu");
AutoPtr<Chunk> chunk = IFF::deserialize(storage);
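To make the chunk layout more concrete, the sketch below reads the outermost header of a DjVu byte stream by hand. It relies on the general DjVu container layout (an "AT&T" magic, then an IFF "FORM" chunk with a big-endian 32-bit length and a four-character form type such as "DJVU" or "DJVM"). FormHeader and readFormHeader are hypothetical names for this illustration, and the code is not how IFF::deserialize is implemented.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct FormHeader
{
    bool valid = false;
    std::uint32_t length = 0; // payload length declared by the FORM chunk
    std::string formType;     // e.g. "DJVU" (single page) or "DJVM" (multipage)
};

// Parses the first 16 bytes: "AT&T", "FORM", big-endian length, form type.
FormHeader readFormHeader(const std::vector<unsigned char>& bytes)
{
    FormHeader header;
    if (bytes.size() < 16)
        return header;
    if (std::memcmp(bytes.data(), "AT&T", 4) != 0)
        return header;
    if (std::memcmp(bytes.data() + 4, "FORM", 4) != 0)
        return header;
    header.length = (static_cast<std::uint32_t>(bytes[8]) << 24) |
                    (static_cast<std::uint32_t>(bytes[9]) << 16) |
                    (static_cast<std::uint32_t>(bytes[10]) << 8) |
                    static_cast<std::uint32_t>(bytes[11]);
    header.formType.assign(reinterpret_cast<const char*>(bytes.data() + 12), 4);
    header.valid = true;
    return header;
}
```

A form type of "DJVM" corresponds to the multipage chunk described in Step 6.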

To save the chunks, use the IFF::serialize method:

AutoPtr<DiskStorageWithRollback> storageW = DiskStorageWithRollback::create("saved.djvu", accessWrite);
IFF::serialize(storageW, chunk);
storageW->commit();

The DiskStorageWithRollback class is a Storage class that provides a rollback feature in case serialization fails. You therefore have to call the DiskStorageWithRollback::commit method after a successful serialization; otherwise, the serialization result is rolled back. For more information, see the StorageRollback class.
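The commit-or-rollback idea can be illustrated without the SDK. In the sketch below, writes are collected in a pending buffer and reach the target only when commit is called; if the writer goes out of scope first, the target keeps its old contents. RollbackWriter is a hypothetical class that uses an in-memory string as the "file"; it says nothing about how DiskStorageWithRollback works internally.

```cpp
#include <string>
#include <utility>

// Hypothetical illustration of commit-or-rollback semantics.
class RollbackWriter
{
public:
    explicit RollbackWriter(std::string& target) : target_(target) {}

    // Buffers data; the target is not touched yet.
    void write(const std::string& data) { pending_ += data; }

    // Replaces the target's contents with everything written so far.
    void commit()
    {
        target_ = std::move(pending_);
        pending_.clear();
        committed_ = true;
    }

    bool committed() const { return committed_; }

    // No explicit destructor is needed: pending data that was never
    // committed is simply discarded, which is the "rollback".

private:
    std::string& target_;
    std::string pending_;
    bool committed_ = false;
};
```

If an exception interrupts the writes before commit is reached, the original contents survive, which is the behavior the tutorial describes for a failed serialization.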

Of course, the Document class also provides a way to save the Document:

AutoPtr<DiskStorageWithRollback> storageW = DiskStorageWithRollback::create("saved.djvu", accessWrite);
doc->save(storageW);
storageW->commit();

The following does almost the same thing:

AutoPtr<DiskStorageWithRollback> storageW = DiskStorageWithRollback::create("saved.djvu", accessWrite);
IFF::serialize(storageW, doc->getChunk());
storageW->commit();

When you modify a Document instance, you have to synchronize its chunks before accessing them directly:

doSomeModificationOnDocument(doc);
doc->updateChunks(); // reflect the changes in the chunks
IFF::serialize(storageW, doc->getChunk());

In the same way, when you modify part of a chunk obtained from a Document instance, you should synchronize the Document instance before using the Document::save method:

doSomeModificationOnChunk(doc->getChunk());
doc->syncToChunks();
doc->save(...);

Cuminas DjVu SDK 3.0.33103
This document is made with doxygen 1.8.5 at Sun Dec 15 2013 19:38:06.