E-book conversion - calibre 7.8.0 documentation (2024)

calibre has a conversion system that is designed to be very easy to use. Normally, you just add a book to calibre, click "Convert", and calibre will try hard to generate output that is as close as possible to the input. However, calibre accepts a very large number of input formats, not all of which are as suitable as others for conversion to e-books. For such input formats, or if you simply want greater control over the conversion system, calibre offers many options to fine-tune the conversion process. Note, however, that calibre's conversion system is not a substitute for a full-blown e-book editor. To edit e-books, I recommend first converting them to EPUB or AZW3 with calibre and then using the Edit book feature to get them into perfect shape. You can then use the edited e-book as input for conversion into other formats in calibre.

This document refers mainly to the conversion settings as they appear in the conversion dialog. All these settings are also available via the command line interface to conversion, documented at ebook-convert. In calibre, you can get help on any individual setting by holding your mouse pointer over it. A tooltip will appear describing the setting.

Introduction

The first thing to understand is how the conversion process is structured. Schematically, a conversion proceeds as follows:

The input format is first converted to XHTML by the appropriate Input plugin. This HTML is then transformed. In the last step, the processed XHTML is converted to the specified output format by the appropriate Output plugin. The results of the conversion can vary greatly, depending on the input format. Some formats convert much better than others. A list of the best source formats for conversion is available in the calibre FAQ.

The transforms that act on the XHTML output are where all the work happens. There are various transforms, for example, to insert book metadata as a page at the start of the book, to detect chapter headings and automatically create a Table of Contents, to proportionally adjust font sizes, et cetera. It is important to remember that all the transforms act on the XHTML output by the Input plugin, not on the input file itself. So, for example, if you ask calibre to convert an RTF file to EPUB, it will first be converted to XHTML internally, the various transforms will be applied to the XHTML and then the Output plugin will create the EPUB file, automatically generating all metadata, Table of Contents, et cetera.

You can see this process in action by using the debug option. Just specify the path to a folder for the debug output. During conversion, calibre will place the XHTML generated by the various stages of the conversion pipeline in different sub-folders. The four sub-folders are:

Stages of the conversion pipeline

Folder       Description
input        This contains the HTML output by the Input plugin. Use this to debug the Input plugin.
parsed       The result of pre-processing and converting to XHTML the output from the Input plugin. Use to debug structure detection.
structure    Post structure detection, but before CSS flattening and font size conversion. Use to debug font size conversion and CSS transforms.
processed    Just before the e-book is passed to the Output plugin. Use to debug the Output plugin.

If you want to edit the input document a little before having calibre convert it, the best thing to do is edit the files in the input sub-folder, then zip it up, and use the ZIP file as the input format for subsequent conversions. To do this use the Edit meta information dialog to add the ZIP file as a format for the book and then, in the top left corner of the conversion dialog, select ZIP as the input format.
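The zip step can also be scripted. The following is a minimal sketch, assuming the debug folder contains an input sub-folder as described above; the function name zip_input_stage and all paths are hypothetical and not part of calibre.

```python
import zipfile
from pathlib import Path


def zip_input_stage(debug_dir, zip_path):
    """Pack the files of the `input` sub-folder into a ZIP archive.

    Paths inside the archive are stored relative to the `input` folder,
    so the edited HTML files sit at the top level of the ZIP.
    """
    input_dir = Path(debug_dir) / "input"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(input_dir.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(input_dir).as_posix())
    return zip_path
```

For example, zip_input_stage("debug", "book.zip") would produce a book.zip that can then be added to the book record as the ZIP format.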

This document will deal mainly with the various transforms that operate on the intermediate XHTML and how to control them. At the end are some tips specific to each input/output format.

Look & feel

This group of options controls various aspects of the look and feel of the converted e-book.

Fonts

One of the nicest features of the e-reading experience is the ability to easily adjust font sizes to suit individual needs and lighting conditions. calibre has sophisticated algorithms to ensure that all the books it outputs have consistent font sizes, no matter what font sizes are specified in the input document.

The base font size of a document is the most common font size in that document, i.e., the size of the bulk of text in that document. When you specify a Base font size, calibre automatically rescales all font sizes in the document proportionately, so that the most common font size becomes the specified base font size and other font sizes are rescaled appropriately. By choosing a larger base font size, you can make the fonts in the document larger and vice versa. When you set the base font size, for best results, you should also set the font size key.

Normally, calibre will automatically choose a base font size appropriate to the output profile you have chosen (see Page setup). However, you can override this here in case the default is not suitable for you.

The Font size key option lets you control how font sizes other than the base font size are rescaled. The font rescaling algorithm works with a font size key, which is simply a comma-separated list of font sizes. The font size key tells calibre how many "steps" bigger or smaller than the base font size a given font size should be. The idea is that there should be a limited number of font sizes in a document: for example, one size for the body text, a couple of sizes for different levels of headings, and a couple of sizes for superscripts, subscripts and footnotes. The font size key allows calibre to sort the font sizes in the input documents into separate bins corresponding to the different logical font sizes.

Let's illustrate with an example. Suppose the source document we are converting was produced by someone with excellent eyesight and has a base font size of 8pt. This means the bulk of the text in the document is sized at 8pt, while headings are somewhat larger (say 10 and 12pt) and footnotes somewhat smaller at 6pt. Now suppose we use the following settings:

Base font size : 12pt
Font size key : 7, 8, 10, 12, 14, 16, 18, 20

The output document will have a base font size of 12pt, headings of 14 and 16pt and footnotes of 8pt. Now suppose we want to make the largest heading size stand out more and make the footnotes a little larger as well. To achieve this, the font size key should be changed to:

New font size key : 7, 9, 12, 14, 18, 20, 22

The largest headings will now become 18pt, while the footnotes will become 9pt. You can play with these settings to find what works best for you by using the font rescaling wizard, which can be accessed via the little button next to the Font size key setting.

All font size rescaling in the conversion can also be disabled, if you would like to preserve the font sizes of the input document.
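Because all conversion settings are also available through ebook-convert, the font settings from the example can be passed on the command line. The sketch below only builds the argument list; it assumes the ebook-convert options --base-font-size and --font-size-mapping correspond to the Base font size and Font size key settings, and the helper name font_options and the file names are hypothetical.

```python
def font_options(base_font_size, font_size_key):
    """Build ebook-convert arguments for the base font size and font size key."""
    key = ",".join(str(size) for size in font_size_key)
    return ["--base-font-size", str(base_font_size), "--font-size-mapping", key]


# Hypothetical file names; pass the list to subprocess.run() to convert.
command = ["ebook-convert", "book.epub", "book_large.epub"]
command += font_options(12, [7, 9, 12, 14, 18, 20, 22])
print(" ".join(command))
```

Keeping the options in a small helper makes it easy to try several font size keys on the same book and compare the results.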

A related setting is Line height. Line height controls the vertical height of lines. By default (a line height of 0), no manipulation of line heights is performed. If you specify a non-default value, line heights will be set in all locations that don't specify their own line heights. However, this is something of a blunt weapon and should be used sparingly. If you want to adjust the line heights for some section of the input, it's better to use the Extra CSS.

In this section you can also tell calibre to embed any referenced fonts into the book. This will allow the fonts to work on reader devices even if they are not available on the device.

Text

Text can be either justified or not. Justified text has extra spaces between words to give a smooth right margin. Some people prefer justified text, others do not. Normally, calibre will preserve the justification in the original document. If you want to override it you can use the Text justification option in this section.

You can also tell calibre to Smarten punctuation which will replace plain quotes, dashes and ellipses with their typographically correct alternatives. Note that this algorithm is not perfect so it is worth reviewing the results. The reverse, namely, Unsmarten punctuation is also available.

Finally, there is Input character encoding. Older documents sometimes don't specify their character encoding. When converted, this can result in non-English characters or special characters like smart quotes being corrupted. calibre tries to auto-detect the character encoding of the source document, but it does not always succeed. You can force it to assume a particular character encoding by using this setting. cp1252 is a common encoding for documents produced using Windows software. You should also read How do I convert my file containing non-English characters, or smart quotes? for more on encoding issues.
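The effect of a wrong encoding guess can be reproduced with a few lines of Python. This is only an illustration of the problem, not calibre's detection code; the sample text is made up.

```python
# Bytes as a Windows program might have written them: curly quotes in cp1252.
raw = "\u201cHello\u201d".encode("cp1252")

# Decoding with the correct encoding recovers the smart quotes.
correct = raw.decode("cp1252")

# Assuming a different encoding silently turns the quotes into control characters.
wrong = raw.decode("latin-1")

print(repr(correct))
print(repr(wrong))
```

When a converted book shows this kind of corruption, setting Input character encoding to the encoding the document was actually written in (for example cp1252) is the fix described above.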

Layout

Normally, paragraphs in XHTML are rendered with a blank line between them and no leading text indent. calibre has a couple of options to control this. Remove spacing between paragraphs forcefully ensures that all paragraphs have no inter-paragraph spacing. It also sets the text indent to 1.5em (can be changed) to mark the start of every paragraph. Insert blank line does the opposite, guaranteeing that there is exactly one blank line between each pair of paragraphs. Both these options are very comprehensive, removing spacing, or inserting it for all paragraphs (technically <p> and <div> tags). This is so that you can just set the option and be sure that it performs as advertised, irrespective of how messy the input file is. The one exception is when the input file uses hard line breaks to implement inter-paragraph spacing.

If you want to remove the spacing between all paragraphs, except a select few, don't use these options. Instead add the following CSS code to Extra CSS:

p, div { margin: 0pt; border: 0pt; text-indent: 1.5em }
.spacious { margin-bottom: 1em; text-indent: 0pt; }

Then, in your source document, mark the paragraphs that need spacing with class="spacious". If your input document is not in HTML, use the Debug option, described in the Introduction, to get HTML (use the input sub-folder).

Another useful option is Linearize tables. Some badly designed documents use tables to control the layout of text on the page. When converted these documents often have text that runs off the page and other artifacts. This option will extract the content from the tables and present it in a linear fashion. Note that this option linearizes all tables, so only use it if you are sure the input document does not use tables for legitimate purposes, like presenting tabular information.

Styling

The Extra CSS option allows you to specify arbitrary CSS that will be applied to all HTML files in the input. This CSS is applied with very high priority and so should override most CSS present in the input document itself. You can use this setting to fine tune the presentation/layout of your document. For example, if you want all paragraphs of class endnote to be right aligned, just add:

.endnote { text-align: right }

or if you want to change the indentation of all paragraphs:

p { text-indent: 5mm; }

Extra CSS is a very powerful option, but you do need an understanding of how CSS works to use it to its full potential. You can use the debug pipeline option described above to see what CSS is present in your input document.

A simpler option is to use Filter style information. This allows you to remove all CSS properties of the specified types from the document. For example, you can use it to remove all colors or fonts.

Transform styles

This is the most powerful styling related facility. You can use it to define rules that change styles based on various conditions. For example you can use it to change all green colors to blue, or remove all bold styling from the text or color all headings a certain color, etc.

Transform HTML

Similar to transform styles, but allows you to make changes to the HTML content of the book. You can replace one tag with another, add classes or other attributes to tags based on their content, etc.

Page setup

The Page setup options are for controlling screen layout, like margins and screen sizes. There are options to set up page margins, which will be used by the output plugin, if the selected output format supports page margins. In addition, you should choose an Input profile and an Output profile. Both sets of profiles basically deal with how to interpret measurements in the input/output documents, screen sizes and default font rescaling keys.

If you know that the file you are converting was intended to be used on a particular device/software platform, choose the corresponding input profile, otherwise just choose the default input profile. If you know the files you are producing are meant for a particular device type, choose the corresponding output profile. Otherwise, choose one of the Generic output profiles. If you are converting to MOBI or AZW3 then you will almost always want to choose one of the Kindle output profiles. Otherwise, your best bet for modern e-book reading devices is to choose the Generic e-ink HD output profile.

The output profile also controls the screen size. This will cause, for example, images to be auto-resized to fit the screen in some output formats. So choose a profile of a device that has a screen size similar to your device.

Heuristic processing

Heuristic processing provides a variety of functions which can be used to try and detect and correct common problems in poorly formatted input documents. Use these functions if your input document suffers from poor formatting. Because these functions rely on common patterns, be aware that in some cases an option may lead to worse results, so use with care. As an example, several of these options will remove all non-breaking-space entities, or may include false positive matches relating to the function.

Enable heuristic processing

This option activates calibre's Heuristic processing stage of the conversion pipeline. This must be enabled in order for various sub-functions to be applied.

Unwrap lines

Enabling this option will cause calibre to attempt to detect and correct hard line breaks that exist within a document using punctuation clues and line length. calibre will first attempt to detect whether hard line breaks exist; if they do not appear to exist, calibre will not attempt to unwrap lines. The line-unwrap factor can be reduced if you want to 'force' calibre to unwrap lines.

Line-unwrap factor

This option controls the algorithm calibre uses to remove hard line breaks. For example, if the value of this option is 0.4, that means calibre will remove hard line breaks from the end of lines whose lengths are less than the length of 40% of all lines in the document. If your document only has a few line breaks which need correction, then this value should be reduced to somewhere between 0.1 and 0.2.

Detect and markup unformatted chapter headings and sub headings

If your document does not have chapter headings and titles formatted differently from the rest of the text, calibre can use this option to attempt to detect them and surround them with heading tags. <h2> tags are used for chapter headings; <h3> tags are used for any titles that are detected.

This function will not create a TOC, but in many cases it will cause calibre's default chapter detection settings to correctly detect chapters and build a TOC. Adjust the XPath under Structure detection if a TOC is not automatically created. If there are no other headings used in the document then setting "//h:h2" under Structure detection would be the easiest way to create a TOC for the document.

The inserted headings are not formatted. To apply formatting, use the Extra CSS option under the Look & feel conversion settings. For example, to center heading tags, use the following:

h2, h3 { text-align: center }

Renumber sequences of <h1> or <h2> tags

Some publishers format chapter headings using multiple <h1> or <h2> tags sequentially. calibre's default conversion settings will cause such titles to be split into two pieces. This option will re-number the heading tags to prevent splitting.

Delete blank lines between paragraphs

This option will cause calibre to analyze blank lines included within the document. If every paragraph is interleaved with a blank line, then calibre will remove all those blank paragraphs. Sequences of multiple blank lines will be considered scene breaks and retained as a single paragraph. This option differs from the Remove paragraph spacing option under Look & feel in that it actually modifies the HTML content, while the other option modifies the document styles. This option can also remove paragraphs which were inserted using calibre's Insert blank line option.

Ensure scene breaks are consistently formatted

With this option calibre will attempt to detect common scene-break markers and ensure that they are center aligned. 'Soft' scene break markers, i.e. scene breaks only defined by extra white space, are styled to ensure that they will not be displayed in conjunction with page breaks.

Replace scene breaks

If this option is configured then calibre will replace scene break markers it finds with the replacement text specified by the user. Please note that some ornamental characters may not be supported across all reading devices.

In general you should avoid using HTML tags; calibre will discard any tags and use pre-defined markup. <hr /> tags, i.e. horizontal rules, and <img> tags are exceptions. Horizontal rules can optionally be specified with styles. If you choose to add your own style, be sure to include the 'width' setting, otherwise the style information will be discarded. Image tags can be used, but calibre does not provide the ability to add the image during conversion; this must be done after the fact using the 'Edit book' feature.

Example image tag (place the image within an 'Images' folder inside the EPUB after conversion):

<img style="width:10%" src="../Images/scenebreak.png" />

Example horizontal rule with styles:

<hr style="width:20%;padding-top: 1px;border-top: 2px ridge black;border-bottom: 2px groove black;"/>

Remove unnecessary hyphens

calibre will analyze all hyphenated content in the document when this option is enabled. The document itself is used as a dictionary for analysis. This allows calibre to accurately remove hyphens for any words in the document in any language, along with made-up and obscure scientific words. The primary drawback is that words appearing only a single time in the document will not be changed. Analysis happens in two passes. The first pass analyzes line endings: lines are only unwrapped if the word exists with or without a hyphen in the document. The second pass analyzes all hyphenated words throughout the document: hyphens are removed if the word exists elsewhere in the document without a hyphen.
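The idea of using the document as its own dictionary can be sketched in a few lines. This is a simplified illustration of the second pass only, not calibre's implementation; the function name remove_unnecessary_hyphens and the sample text are hypothetical.

```python
import re

# A run of letters in any language (no digits, no underscore).
WORD = r"[^\W\d_]+"


def remove_unnecessary_hyphens(text):
    """Drop a hyphen when the joined word also occurs unhyphenated in the text."""
    known_words = {word.lower() for word in re.findall(WORD, text)}

    def join_if_known(match):
        joined = match.group(1) + match.group(2)
        # Only join when the document itself confirms the word exists.
        return joined if joined.lower() in known_words else match.group(0)

    return re.sub(rf"({WORD})-({WORD})", join_if_known, text)


sample = "The con-version failed. A second conversion worked. It is a well-known issue."
print(remove_unnecessary_hyphens(sample))
```

In the sample, "con-version" is joined because "conversion" appears elsewhere, while "well-known" is left alone because "wellknown" never occurs. This also shows the drawback mentioned above: a word that appears only once, hyphenated, cannot be confirmed and stays unchanged.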

Italicize common words and patterns

When enabled, calibre will look for common words and patterns that denote italics and italicize them. Examples are common text conventions such as ~word~ or phrases that should generally be italicized, e.g. Latin phrases like 'etc.' or 'et cetera'.
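As an illustration of one such pattern, the ~word~ convention can be turned into italics with a single regular expression. This is a sketch of the idea under the assumption that only single words are marked; calibre's own pattern list is more extensive, and the function name italicize_tildes is hypothetical.

```python
import re


def italicize_tildes(text):
    """Wrap words written as ~word~ in <i> tags."""
    return re.sub(r"~(\w+)~", r"<i>\1</i>", text)


print(italicize_tildes("It was ~really~ late."))
```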

Replace entity indents with CSS indents

Some documents use a convention of defining text indents using non-breaking space entities. When this option is enabled calibre will attempt to detect this sort of formatting and convert them to a 3% text indent using CSS.

Search & replace

These options are useful primarily for conversion of PDF documents or OCR conversions, though they can also be used to fix many document specific problems. As an example, some conversions can leave behind page headers and footers in the text. These options use regular expressions to try and detect headers, footers, or other arbitrary text and remove or replace them. Remember that they operate on the intermediate XHTML produced by the conversion pipeline. There is a wizard to help you customize the regular expressions for your document. Click the magic wand beside the expression box, and click the 'Test' button after composing your search expression. Successful matches will be highlighted in yellow.

The search works by using a Python regular expression. All matched text is simply removed from the document or replaced using the replacement pattern. The replacement pattern is optional; if left blank, text matching the search pattern will be deleted from the document. You can learn more about regular expressions and their syntax at All about using regular expressions in calibre.
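Since the search is a Python regular expression, its behavior can be tried out with Python's re module before using it in the conversion dialog. The sketch below assumes a page footer of the form "Page 12" sitting in its own paragraph; the sample markup, the pattern and the function name search_and_replace are made up for illustration.

```python
import re


def search_and_replace(xhtml, search, replacement=""):
    """Remove matches of `search`, or replace them when a replacement is given."""
    return re.sub(search, replacement, xhtml)


page = "<p>Some text</p>\n<p>Page 12</p>\n<p>More text</p>\n"

# Empty replacement: the matched footer paragraph is deleted.
print(search_and_replace(page, r"<p>\s*Page \d+\s*</p>\n?"))

# With a replacement pattern, groups from the search can be reused.
print(search_and_replace(page, r"<p>Page (\d+)</p>", r"<!-- page \1 -->"))
```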

Structure detection

Structure detection involves calibre trying its best to detect structural elements in the input document, when they are not properly specified. For example, chapters, page breaks, headers, footers, etc. As you can imagine, this process varies widely from book to book. Fortunately, calibre has very powerful options to control this. With power comes complexity, but once you take the time to learn the complexity, you will find it well worth the effort.

Chapters and page breaks

calibre has two sets of options for chapter detection and inserting page breaks. This can sometimes be slightly confusing, as by default, calibre will insert page breaks before detected chapters as well as the locations detected by the page breaks option. The reason for this is that there are often locations where page breaks should be inserted that are not chapter boundaries. Also, detected chapters can be optionally inserted into the auto generated Table of Contents.

calibre uses XPath, a powerful language to allow the user to specify chapter boundaries/page breaks. XPath can seem a little daunting to use at first. Fortunately, there is an XPath tutorial in the User Manual. Remember that Structure detection operates on the intermediate XHTML produced by the conversion pipeline. Use the debug option described in the Introduction to figure out the appropriate settings for your book. There is also a button for an XPath wizard to help with the generation of simple XPath expressions.

By default, calibre uses the following expression for detecting chapters:

//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part\s+', 'i')) or @class = 'chapter']

This expression is rather complex, because it tries to handle a number of common cases simultaneously. What it means is that calibre will assume chapters start at either <h1> or <h2> tags that have any of the words (chapter, book, section or part) in them or that have the class="chapter" attribute.

A related option is Chapter mark, which allows you to control what calibre does when it detects a chapter. By default, it will insert a page break before the chapter. You can have it insert a ruled line instead of, or in addition to the page break. You can also have it do nothing.
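To make the logic of the default expression concrete, here is a rough Python equivalent using only the standard library. It is a sketch, not how calibre evaluates XPath: the re:test() call is replaced by an ordinary regular expression search, and the sample document and the function name is_chapter_start are made up.

```python
import re
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"
# Same alternatives as in the expression, matched case-insensitively.
CHAPTER_WORDS = re.compile(r"chapter|book|section|part\s+", re.IGNORECASE)


def is_chapter_start(element):
    """True for h1/h2 tags containing a keyword, or any tag with class="chapter"."""
    if element.get("class") == "chapter":
        return True
    if element.tag in (XHTML + "h1", XHTML + "h2"):
        text = "".join(element.itertext())
        return CHAPTER_WORDS.search(text) is not None
    return False


sample = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<h1>Chapter 1</h1><p>Text</p>
<h2>A quiet interlude</h2>
<div class="chapter">Two</div>
</body></html>"""

root = ET.fromstring(sample)
chapters = ["".join(e.itertext()) for e in root.iter() if is_chapter_start(e)]
print(chapters)  # ['Chapter 1', 'Two']
```

The second heading is not treated as a chapter because it contains none of the keywords, while the <div> is picked up purely through its class attribute.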

The default setting for detecting page breaks is:

//*[name()='h1' or name()='h2']

which means that calibre will insert page breaks before every <h1> and <h2> tag by default.

Note

The default expressions may change depending on the input format you are converting.

Miscellaneous

There are a few more options in this section.

Insert metadata as page at start of book

One of the great things about calibre is that it allows you to maintain very complete metadata about all of your books, for example, a rating, tags, comments, etc. This option will create a single page with all this metadata and insert it into the converted e-book, typically just after the cover. Think of it as a way to create your own customised book jacket.

Remove first image

Some source files you convert contain the cover as part of the book rather than as a separate image. If you also specify a cover in calibre, the converted book will have two covers. This option simply removes the first image from the source document, thereby ensuring that the converted book has only one cover, the one specified in calibre.

Table of Contents

When the input document has a Table of Contents in its metadata, calibre will just use that. However, a number of older formats either do not support a metadata based Table of Contents, or individual documents do not have one. In these cases, the options in this section can help you automatically generate a Table of Contents in the converted e-book, based on the actual content in the input document.

Note

Using these options can be a little challenging to get exactly right. If you prefer creating/editing the Table of Contents by hand, convert to the EPUB or AZW3 formats and select the checkbox at the bottom of the Table of Contents section of the conversion dialog that says Manually fine-tune the Table of Contents after conversion. This will launch the ToC Editor tool after the conversion. It allows you to create entries in the Table of Contents by simply clicking the place in the book where you want the entry to point. You can also use the ToC Editor by itself, without doing a conversion. Go to Preferences → Interface → Toolbars and add the ToC Editor to the main toolbar. Then just select the book you want to edit and click the ToC Editor button.

The first option is Force use of auto-generated Table of Contents. By checking this option you can have calibre override any Table of Contents found in the metadata of the input document with the auto generated one.

By default, when creating the auto generated Table of Contents, calibre will first try to add any detected chapters to it. You can learn how to customize the detection of chapters in the Structure detection section above. If you do not want to include detected chapters in the generated Table of Contents, check the Do not add detected chapters option.

If fewer than the Chapter threshold number of chapters were detected, calibre will then add any hyperlinks it finds in the input document to the Table of Contents. This often works well: many input documents include a hyperlinked Table of Contents right at the start. The Number of links option can be used to control this behavior. If set to zero, no links are added. If set to a number greater than zero, at most that number of links are added.

calibre will automatically filter duplicates from the generated Table of Contents. However, if there are some additional undesirable entries, you can filter them using the TOC Filter option. This is a regular expression that will match the title of entries in the generated Table of Contents. Whenever a match is found, the entry will be removed. For example, to remove all entries titled "Next" or "Previous" use:

Next|Previous
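The effect of such a filter can be previewed in Python. The sketch below assumes the expression is applied to each title with search semantics (a match anywhere in the title removes the entry); the list of titles and the function name filter_toc are made up for illustration.

```python
import re


def filter_toc(titles, toc_filter):
    """Keep only the titles that the filter expression does not match."""
    pattern = re.compile(toc_filter)
    return [title for title in titles if pattern.search(title) is None]


print(filter_toc(["Chapter 1", "Next", "Chapter 2", "Previous"], r"Next|Previous"))
```

Under that assumption a title such as "Next Steps" would be removed as well, so it is worth anchoring the expression (for example ^(Next|Previous)$) when legitimate titles contain the same words.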

The Level 1,2,3 TOC options allow you to create a sophisticated multi-level Table of Contents. They are XPath expressions that match tags in the intermediate XHTML produced by the conversion pipeline. See the Introduction for how to get access to this XHTML. Also read the XPath tutorial to learn how to construct XPath expressions. Next to each option is a button that launches a wizard to help with the creation of basic XPath expressions. The following simple example illustrates how to use these options.

Suppose you have an input document that results in XHTML that looks like this:

<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title>Sample document</title>
    </head>
    <body>
        <h1>Chapter 1</h1>
        ...
        <h2>Section 1.1</h2>
        ...
        <h2>Section 1.2</h2>
        ...
        <h1>Chapter 2</h1>
        ...
        <h2>Section 2.1</h2>
        ...
    </body>
</html>

Then, we set the options as:

Level 1 TOC : //h:h1
Level 2 TOC : //h:h2

This will result in an automatically generated two level Table of Contents that looks like:

Chapter 1
    Section 1.1
    Section 1.2
Chapter 2
    Section 2.1
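The nesting can be reproduced outside calibre with Python's standard library, which helps when checking what an expression will match. This is a sketch, not calibre's TOC builder: xml.etree.ElementTree needs the relative form .//h:h1 instead of //h:h1, and the function name build_toc is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"h": "http://www.w3.org/1999/xhtml"}

sample = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Sample document</title></head>
<body>
<h1>Chapter 1</h1>
<h2>Section 1.1</h2>
<h2>Section 1.2</h2>
<h1>Chapter 2</h1>
<h2>Section 2.1</h2>
</body>
</html>"""


def build_toc(xhtml):
    """Return [(level-1 title, [level-2 titles])] in document order."""
    root = ET.fromstring(xhtml)
    level1 = set(root.findall(".//h:h1", NS))
    level2 = set(root.findall(".//h:h2", NS))
    toc = []
    for element in root.iter():  # iter() walks the tree in document order
        title = "".join(element.itertext()).strip()
        if element in level1:
            toc.append((title, []))
        elif element in level2 and toc:
            # A level-2 match is attached to the most recent level-1 entry.
            toc[-1][1].append(title)
    return toc


for chapter, sections in build_toc(sample):
    print(chapter)
    for section in sections:
        print("    " + section)
```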

Warning

Not all output formats support a multi level Table of Contents. You should first try with EPUB output. If that works, then try your format of choice.

Using images as chapter titles when converting HTML input documents

Suppose you want to use an image as your chapter title, but still want calibre to be able to automatically generate a Table of Contents for you from the chapter titles. Use the following HTML markup to achieve this:

<html>
    <body>
        <h2>Chapter 1</h2>
        <p>chapter 1 text...</p>
        <h2 title="Chapter 2"><img src="chapter2.jpg" /></h2>
        <p>chapter 2 text...</p>
    </body>
</html>

Set the Level 1 TOC setting to //h:h2. Then, for chapter two, calibre will take the title from the value of the title attribute on the <h2> tag, since the tag has no text.

Using tag attributes to supply the text for entries in the Table of Contents

If you have particularly long chapter titles and want shortened versions in the Table of Contents, you can use the title attribute to achieve this, for example:

<html>
    <body>
        <h2 title="Chapter 1">Chapter 1: Some very long title</h2>
        <p>chapter 1 text...</p>
        <h2 title="Chapter 2">Chapter 2: Some other very long title</h2>
        <p>chapter 2 text...</p>
    </body>
</html>

Set the Level 1 TOC setting to //h:h2/@title. Then calibre will take the title from the value of the title attribute on the <h2> tags, instead of using the text inside the tag. Note the trailing /@title on the XPath expression. You can use this form to tell calibre to get the text from any attribute you like.
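What the attribute form selects can be checked with the standard library. This is only an approximation of //h:h2/@title, since xml.etree.ElementTree cannot return attributes directly from a path; the sample document, with an XHTML namespace added, is made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {"h": "http://www.w3.org/1999/xhtml"}

sample = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<h2 title="Chapter 1">Chapter 1: Some very long title</h2>
<h2 title="Chapter 2">Chapter 2: Some other very long title</h2>
</body></html>"""

root = ET.fromstring(sample)
# Select the <h2> tags that carry a title attribute, then read its value.
short_titles = [h2.get("title") for h2 in root.findall(".//h:h2[@title]", NS)]
print(short_titles)  # ['Chapter 1', 'Chapter 2']
```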

How options are set/saved for conversion

There are two places where conversion options can be set in calibre. The first is in Preferences->Conversion. These settings are the defaults for the conversion options. Whenever you try to convert a new book, the settings set here will be used by default.

You can also change settings in the conversion dialog for each book conversion. When you convert a book, calibre remembers the settings you used for that book, so that if you convert it again, the saved settings for the individual book will take precedence over the defaults set in Preferences. You can restore the individual settings to defaults by using the Restore defaults button in the individual book conversion dialog. You can remove the saved settings for a group of books by selecting all the books and then clicking the Edit metadata button to bring up the bulk metadata edit dialog. Near the bottom of the dialog is an option to remove stored conversion settings.

When you bulk convert a set of books, settings are taken in the following order (last one wins):

  • From the defaults set in Preferences->Conversion

  • From the saved conversion settings for each book being converted (if any). This can be turned off by the option in the top left corner of the Bulk conversion dialog.

  • From the settings set in the Bulk conversion dialog

Note that the final settings for each book in a Bulk conversion will be saved and re-used if the book is converted again. Since the highest priority in Bulk conversion is given to the settings in the Bulk conversion dialog, these will override any book specific settings. So you should only bulk convert books together that need similar settings. The exceptions are metadata and input format specific settings. Since the Bulk conversion dialog does not have settings for these two categories, they will be taken from book specific settings (if any) or the defaults.
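The "last one wins" order can be pictured as successive dictionary updates. The sketch below is a simplified model and ignores the metadata and input format exceptions just mentioned; the setting names and the function name effective_settings are hypothetical.

```python
def effective_settings(defaults, saved_for_book, bulk_dialog):
    """Merge the three sources of settings; later sources override earlier ones."""
    settings = dict(defaults)
    settings.update(saved_for_book)
    settings.update(bulk_dialog)
    return settings


defaults = {"base_font_size": 0, "linearize_tables": False, "extra_css": ""}
saved_for_book = {"base_font_size": 10, "linearize_tables": True}
bulk_dialog = {"base_font_size": 12}

print(effective_settings(defaults, saved_for_book, bulk_dialog))
```

Here the base font size from the bulk dialog overrides the value saved for the book, while the saved linearize_tables value survives because the bulk dialog in this example does not set it.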

Note

You can see the actual settings used during any conversion by clicking the rotating icon in the lower right corner and then double clicking the individual conversion job. This will bring up a conversion log that will contain the actual settings used, near the top.

Format-specific tips

Here you will find tips specific to the conversion of particular formats. Options specific to a particular format, whether input or output, are available in the conversion dialog under their own section, for example TXT input or EPUB output.

Convert Microsoft Word documents

calibre can automatically convert .docx files created by Microsoft Word 2007 and newer. Just add the file to calibre and click convert.

Note

There is a demo .docx file that demonstrates the capabilities of the calibre conversion engine. Just download it and convert it to EPUB or AZW3 to see what calibre can do.

calibre will automatically generate a Table of Contents based on headings if you mark your headings with the Heading 1, Heading 2, etc. styles in Microsoft Word. Open the output e-book in the calibre E-book viewer and click the Table of Contents button to view the generated Table of Contents.

Older .doc files

For older .doc files, you can save the document as HTML with Microsoft Word and then convert the resulting HTML file with calibre. When saving as HTML, be sure to use the "Save as Web Page, Filtered" option as this will produce clean HTML that will convert well. Note that Word produces really messy HTML and converting it can take a long time, so be patient. If you have a newer version of Word available, you can directly save it as .docx as well.

Another alternative is to use the free LibreOffice. Open your .doc file in LibreOffice and save it as .docx, which can be directly converted in calibre.

Convert TXT documents

TXT documents have no well defined way to specify formatting like bold, italics, etc., or document structure like paragraphs, headings, sections and so on, but there are a variety of conventions commonly used. By default calibre attempts automatic detection of the correct formatting and markup based on those conventions.

TXT input supports a number of options to differentiate how paragraphs are detected.

Paragraph style: Auto

Analyzes the text file and attempts to automatically determine how paragraphs are defined. This option will generally work fine. If you get undesirable results, try one of the manual options.

Paragraph style: Block

Assumes one or more blank lines are a paragraph boundary:

    This is the first.

    This is the
    second paragraph.

Paragraph style: Single

Assumes that every line is a paragraph:

    This is the first.
    This is the second.
    This is the third.

Paragraph style: Print

Assumes that every paragraph starts with an indent (either a tab or 2+ spaces). Paragraphs end when the next line that starts with an indent is reached:

      This is the
    first.
      This is the second.

      This is the
    third.

Paragraph style: Unformatted

Assumes that the document has no formatting, but does use hard line breaks. Punctuation and median line length are used to attempt to re-create paragraphs.
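The Block, Single and Print paragraph styles described above amount to three different splitting rules. The following is a minimal sketch of those rules as stated in this section; the function names are illustrative and the code is not calibre's actual TXT input implementation.

```python
import re


def split_block(text):
    """Block style: one or more blank lines separate paragraphs."""
    chunks = re.split(r"\n\s*\n", text.strip("\n"))
    # Collapse the hard line breaks inside each paragraph into spaces.
    return [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]


def split_single(text):
    """Single style: every non-empty line is its own paragraph."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def split_print(text):
    """Print style: a line starting with a tab or 2+ spaces opens a paragraph."""
    paragraphs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines carry no meaning in this style
        indented = line.startswith("\t") or line.startswith("  ")
        if indented or not paragraphs:
            paragraphs.append(line.strip())
        else:
            paragraphs[-1] += " " + line.strip()
    return paragraphs


print(split_block("This is the first.\n\nThis is the\nsecond paragraph."))
print(split_single("This is the first.\nThis is the second.\nThis is the third."))
print(split_print("  This is the\nfirst.\n  This is the second.\n\n  This is the\nthird."))
```

Running the same text through more than one function is a quick way to see why picking the wrong paragraph style produces merged or fragmented paragraphs.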

Formatting style: Auto

Attempts to detect the type of formatting markup being used. If no markup is used then heuristic formatting will be applied.

Formatting style: Heuristic

Analyzes the document for common chapter headings, scene breaks, and italicized words and applies the appropriate HTML markup during conversion.

Formatting style: Markdown

calibre also supports running TXT input through a transformation preprocessor known as Markdown. Markdown allows for basic formatting to be added to TXT documents, such as bold, italics, section headings, tables, lists, a Table of Contents, etc. Marking chapter headings with a leading # and setting the chapter XPath detection expression to "//h:h1" is the easiest way to have a proper table of contents generated from a TXT document. You can learn more about the Markdown syntax at daringfireball.

Formatting style: None

Applies no special formatting to the text. The document is converted to HTML with no other changes.

Convert PDF documents

PDF documents are one of the worst formats to convert from. They are a fixed page size and text placement format, which means it is very difficult to determine where one paragraph ends and another begins. calibre will try to unwrap paragraphs using a configurable Line un-wrapping factor. This is a scale used to determine the length at which a line should be unwrapped. Valid values are a decimal between 0 and 1. The default is 0.45, just under the median line length. Lower this value to include more text in the unwrapping; increase it to include less. You can adjust this value in the conversion settings under PDF Input.
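To build some intuition for the un-wrapping factor, here is a deliberately simplified sketch. It assumes the factor picks a threshold length from the sorted line lengths (0.5 would be the median), and that any line at least that long which does not end in sentence punctuation is joined to the next line. Both the function name unwrap_lines and this exact rule are assumptions for illustration; calibre's real heuristics are more involved.

```python
def unwrap_lines(text, factor=0.45):
    """Join hard-wrapped lines into paragraphs using a length threshold."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    lengths = sorted(len(line) for line in lines)
    # Pick the length found `factor` of the way through the sorted lengths.
    index = min(int(len(lengths) * factor), len(lengths) - 1)
    threshold = lengths[index]

    paragraphs = []
    join_next = False
    for line in lines:
        if join_next:
            paragraphs[-1] += " " + line
        else:
            paragraphs.append(line)
        # Long lines without closing punctuation are assumed to be wrapped.
        join_next = len(line) >= threshold and not line.endswith((".", "!", "?"))
    return paragraphs


SAMPLE = "\n".join([
    "aaaa aaaa aaaa aaaa",
    "bbbb bbbb bbbb bbbb",
    "end.",
    "Title",
    "cccc cccc cccc cccc",
    "done.",
])

print(unwrap_lines(SAMPLE, factor=0.45))
print(unwrap_lines(SAMPLE, factor=0.6))
```

With the lower factor the short "Title" line is long enough to be merged into the following text; with the higher factor it stays a separate paragraph, matching the advice that lowering the value includes more text in the unwrapping.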

PDF documents also often have headers and footers as part of the document, which become included with the text. Use the Search and replace panel to remove headers and footers to mitigate this issue. If the headers and footers are not removed from the text, they can throw off the paragraph unwrapping. To learn how to use the header and footer removal options, read All about using regular expressions in calibre.
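As a rough illustration of the kind of expression that removes running headers and footers, the sketch below strips whole lines that consist only of a page counter or a repeated book title. The pattern, the sample title "My Book Title" and the helper name are hypothetical; an expression for a real book has to be adapted to what that PDF actually contains, and should be tested before being used in the Search and replace panel.

```python
import re

# Matches a full line that is only "Page N of M" or the running title,
# including the line break that follows it.
HEADER_FOOTER = re.compile(
    r"^[ \t]*(?:Page \d+ of \d+|My Book Title)[ \t]*$\n?",
    re.MULTILINE,
)


def strip_headers_footers(text):
    """Delete header/footer lines so wrapped sentences become adjacent again."""
    return HEADER_FOOTER.sub("", text)


SAMPLE = (
    "First part of a sentence\n"
    "Page 1 of 20\n"
    "My Book Title\n"
    "that continues on the next page.\n"
)

print(strip_headers_footers(SAMPLE))
```

Anchoring the pattern to whole lines keeps it from deleting the same words when they appear in the middle of ordinary text.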

Some limitations of PDF input are:

  • Complex, multi-column, and image based documents are not supported.

  • Extraction of vector images and tables from within the document is also not supported.

  • Some PDFs use special glyphs to represent ll or ff or fi, etc. Conversion of these may or may not work depending on just how they are represented internally in the PDF.

  • Links and Tables of Contents are not supported

  • PDFs that use embedded non-Unicode fonts to represent non-English characters will result in garbled output for those characters

  • Some PDFs are made up of photographs of the page with OCRed text behind them. In such cases calibre uses the OCRed text, which can be very different from what you see when you view the PDF file

  • PDFs that are used to display complex text, like right to left languages and math typesetting will not convert correctly

To reiterate: PDF is a really, really bad format to use as input. If you absolutely must use PDF, then be prepared for an output ranging anywhere from decent to unusable, depending on the input PDF.

Comic book collections

A comic book collection is a .cbc file. A .cbc file is a ZIP file that contains other CBZ/CBR files. In addition the .cbc file must contain a simple text file called comics.txt, encoded in UTF-8. The comics.txt file must contain a list of the comics files inside the .cbc file, in the form filename:title, as shown below:

    one.cbz:Chapter One
    two.cbz:Chapter Two
    three.cbz:Chapter Three

The .cbc file will then contain:

    comics.txt
    one.cbz
    two.cbz
    three.cbz

calibre will automatically convert this .cbc file into an e-book with a Table of Contents pointing to each entry in comics.txt.
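Assembling such a collection by hand is easy to script. The following sketch uses only Python's standard zipfile module to write comics.txt and the individual comic archives into a .cbc file. The function names build_cbc and demo are illustrative, and the demo uses empty placeholder archives instead of real comics.

```python
import tempfile
import zipfile
from pathlib import Path


def build_cbc(cbc_path, comics):
    """Create a .cbc file from (comic_file, title) pairs in reading order."""
    listing = "".join(f"{Path(src).name}:{title}\n" for src, title in comics)
    with zipfile.ZipFile(cbc_path, "w") as cbc:
        # comics.txt must be UTF-8 and list every comic as filename:title.
        cbc.writestr("comics.txt", listing.encode("utf-8"))
        for src, _title in comics:
            cbc.write(src, arcname=Path(src).name)
    return cbc_path


def demo():
    """Build a collection from two placeholder comics and inspect it."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        comics = []
        for stem, title in [("one", "Chapter One"), ("two", "Chapter Two")]:
            cbz = tmp / f"{stem}.cbz"
            with zipfile.ZipFile(cbz, "w"):
                pass  # empty stand-in for a real comic archive
            comics.append((cbz, title))
        cbc = build_cbc(tmp / "collection.cbc", comics)
        with zipfile.ZipFile(cbc) as archive:
            return archive.namelist(), archive.read("comics.txt").decode("utf-8")


names, listing = demo()
print(names)
print(listing)
```

The comics are stored without extra compression because CBZ and CBR files are already compressed archives.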

EPUB advanced formatting demo

Various advanced formatting for EPUB files is demonstrated in this demo file. The file was created from hand coded HTML using calibre and is meant to be used as a template for your own EPUB creation efforts.

The source HTML it was created from is available as demo.zip. The settings used to create the EPUB from the ZIP file are:

ebook-convert demo.zip .epub -vv --authors "Kovid Goyal" --language en --level1-toc '//*[@class="title"]' --disable-font-rescaling --page-breaks-before / --no-default-epub-cover

Note that because this file explores the potential of EPUB, most of the advanced formatting is not going to work on readers less capable than calibre’s built-in EPUB viewer.

Convert ODT documents

calibre can directly convert ODT (OpenDocument Text) files. You should use styles to format your document and minimize the use of direct formatting. When inserting images into your document you need to anchor them to the paragraph. Images anchored to a page will all end up at the front of the converted book.

To enable automatic detection of chapters, you need to mark them with the built-in styles called Heading 1, Heading 2, …, Heading 6 (Heading 1 equates to the HTML tag <h1>, Heading 2 to <h2>, etc.). When you convert in calibre you can enter which style you used into the Detect chapters at box. Example:

  • If you mark Chapters with style Heading 2, you have to set the 'Detect chapters at' box to //h:h2

  • For a nested TOC with Sections marked with Heading 2 and the Chapters marked with Heading 3 you need to enter //h:h2|//h:h3. On the Convert - TOC page set the Level 1 TOC box to //h:h2 and the Level 2 TOC box to //h:h3.
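The same settings can be passed on the command line. The sketch below only builds the argument list for ebook-convert, using its --chapter, --level1-toc and --level2-toc options, without running anything. The helper name build_odt_command and the file names are assumptions for illustration.

```python
import shlex


def build_odt_command(source, output, chapter_xpath, toc_levels=()):
    """Return an ebook-convert argument list for chapter and TOC detection.

    toc_levels is a sequence of up to three XPath expressions, one per
    TOC level, starting with level 1.
    """
    if len(toc_levels) > 3:
        raise ValueError("ebook-convert offers TOC options for levels 1 to 3 only")
    command = ["ebook-convert", source, output, "--chapter", chapter_xpath]
    for level, xpath in enumerate(toc_levels, start=1):
        command += [f"--level{level}-toc", xpath]
    return command


command = build_odt_command(
    "book.odt",
    "book.epub",
    chapter_xpath="//h:h2|//h:h3",
    toc_levels=["//h:h2", "//h:h3"],
)

# shlex.join quotes the XPath expressions so the line is safe to paste
# into a POSIX shell.
print(shlex.join(command))
```

Passing the list to subprocess.run would perform the conversion, provided calibre's command line tools are installed and on the PATH.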

Well-known document properties (Title, Keywords, Description, Creator) are recognized and calibre will use the first image (not too small, and with a good aspect ratio) as the cover image.

There is also an advanced property conversion mode, which is activated by setting the custom property opf.metadata ('Yes or No' type) to Yes in your ODT document (File->Properties->Custom Properties). If this property is detected by calibre, the following custom properties are recognized (opf.authors overrides document creator):

    opf.titlesort
    opf.authors
    opf.authorsort
    opf.publisher
    opf.pubdate
    opf.isbn
    opf.language
    opf.series
    opf.seriesindex

In addition to this, you can specify the picture to use as the cover by naming it opf.cover (right click, Picture->Options->Name) in the ODT. If no picture with this name is found, the 'smart' method is used. As the cover detection might result in double covers in certain output formats, the process will remove the paragraph containing the cover from the document (only if the cover is the paragraph's only content). This works only with the named picture.

To disable cover detection you can set the custom property opf.nocover ('Yes or No' type) to Yes in advanced mode.

Convert to PDF

The first, most important, setting to decide on when converting to PDF is the page size. By default, calibre uses a page size of "U.S. Letter". You can change this to another standard page size or a completely custom size in the PDF Output section of the conversion dialog. If you are generating a PDF to be used on a specific device, you can turn on the option to use the page size from the output profile instead. So if your output profile is set to Kindle, calibre will create a PDF with a page size suitable for viewing on the small Kindle screen.

Headers and footers

You can insert arbitrary headers and footers on every page of the PDF by specifying header and footer templates. Templates are just snippets of HTML code that get rendered in the header and footer locations. For example, to display page numbers centered at the bottom of every page, in green, use the following footer template:

<footer><div style="margin: auto; color: green">_PAGENUM_</div></footer>

calibre will automatically replace _PAGENUM_ with the current page number. You can even put different content on even and odd pages. For example, the following header template will show the title on odd pages and the author on even pages:

    <header style="justify-content: flex-end">
        <div class="even-page">_AUTHOR_</div>
        <div class="odd-page"><i>_TITLE_</i></div>
    </header>

calibre will automatically replace _TITLE_ and _AUTHOR_ with the title and author of the document being converted. Setting justify-content to flex-end will cause the text to be right aligned.

You can also display text at the left and right edges and change the font size, as demonstrated with this header template:

    <header style="justify-content: space-between; font-size: smaller">
        <div>_TITLE_</div>
        <div>_AUTHOR_</div>
    </header>

This will display the title at the left and the author at the right, in a font size smaller than the main text.

You can also use the current section in templates, as shown below:

<header><div>_SECTION_</div></header>

_SECTION_ is replaced by whatever the name of the current section is. These names are taken from the metadata Table of Contents in the document (the PDF Outline). If the document has no table of contents then it will be replaced by empty text. If a single PDF page has multiple sections, the first section on the page will be used. Similarly, there is a variable named _TOP_LEVEL_SECTION_ that can be used to get the name of the current top-level section.

You can even use JavaScript inside the header and footer templates. For example, the following template will cause page numbers to start at 4 instead of 1:

    <footer>
        <div></div>
        <script>document.currentScript.parentNode.querySelector("div").innerHTML = "" + (_PAGENUM_ + 3)</script>
    </footer>

In addition there are some more variables you can use in the headers and footers, documented below:

  • _TOTAL_PAGES_ - total number of pages in the PDF file, useful for implementing a progress counter, for example.

  • _TOP_LEVEL_SECTION_PAGES_ - total number of pages in the current top level section

  • _TOP_LEVEL_SECTION_PAGENUM_ - the page number of the current page within the current top level section

Note

When adding headers and footers make sure you set the page top and bottom margins to large enough values, under the PDF Output section of the conversion dialog.
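To preview what a template will look like for a given page, the variable substitution can be imitated outside calibre. The sketch below is a hypothetical stand-in for that step, not calibre's own code: render_template is an invented helper that replaces the _NAME_ placeholders documented above with HTML-escaped values.

```python
from html import escape


def render_template(template, values):
    """Replace _NAME_ placeholders in a header/footer template.

    values maps placeholder names without the surrounding underscores
    (for example "PAGENUM") to the text to insert.
    """
    # Longer names go first so that _SECTION_ cannot clobber part of
    # _TOP_LEVEL_SECTION_ before that one is replaced.
    for name in sorted(values, key=len, reverse=True):
        template = template.replace(f"_{name}_", escape(str(values[name])))
    return template


FOOTER = '<footer><div style="margin: auto">_PAGENUM_ / _TOTAL_PAGES_</div></footer>'
HEADER = "<header><div>_TOP_LEVEL_SECTION_: _SECTION_</div></header>"

print(render_template(FOOTER, {"PAGENUM": 7, "TOTAL_PAGES": 120}))
print(render_template(HEADER, {"SECTION": "Setup & Install", "TOP_LEVEL_SECTION": "Part 1"}))
```

Opening the rendered snippets in a browser gives a quick impression of alignment and font size before running a full conversion.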

Printable Table of Contents

You can also insert a printable Table of Contents at the end of the PDF that lists the page numbers for every section. This is very useful if you intend to print out the PDF to paper. If you wish to use the PDF on an electronic device, then the PDF Outline provides this functionality and is generated by default.

You can customize the look of the generated Table of Contents by using the Extra CSS conversion setting under the Look & feel part of the conversion dialog. The default CSS used is listed below; simply copy it and make whatever changes you like.

    .calibre-pdf-toc table { width: 100% }

    .calibre-pdf-toc table tr td:last-of-type { text-align: right }

    .calibre-pdf-toc .level-0 {
        font-size: larger;
    }

    .calibre-pdf-toc .level-1 td:first-of-type { padding-left: 1.4em }

    .calibre-pdf-toc .level-2 td:first-of-type { padding-left: 2.8em }

Custom page margins for individual HTML files

If you are converting an EPUB or AZW3 file with multiple individual HTML files inside it and you want to change the page margins for a particular HTML file, you can add the following style block to the HTML file using the calibre E-book editor:

    <style>
    @page {
        margin-left: 10pt;
        margin-right: 10pt;
        margin-top: 10pt;
        margin-bottom: 10pt;
    }
    </style>

Then, in the PDF Output section of the conversion dialog, turn on the option to Use page margins from the document being converted. Now all pages generated from this HTML file will have 10pt margins.

© Copyright Kovid Goyal. Last updated on Apr. 05, 2024.
