本の虫: GNU Texinfo's makeinfo Has Been Reimplemented from C to Perl

2013-02-18

GNU Texinfo's makeinfo Has Been Reimplemented from C to Perl

On February 16, 2013, GNU Texinfo was updated to version 5, the first new release in about five years. The previous release was version 4.13, on September 18, 2008.

Apparently, during 2012, Texinfo's makeinfo, which had previously been implemented in C, was reimplemented in Perl as a program generically called texi2any.

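The Perl implementation is said to have made the program slower, but the code is now simpler, newcomers can join development more easily, and the program is easier to extend. Unicode support, which would have been difficult to add to the C makeinfo, also becomes very easy in Perl.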

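Well, even if it is slower, all it does is convert text, and at most about as much text as people can write by hand. Besides, most people never run texi2any or makeinfo directly. And even on a system where you build everything from source yourself, documentation included, the cost is still trivial compared with the rest of the build. So I don't think it really matters.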

NEWS

The new implementation is in Perl, requiring Perl 5.7.3 (released in March 2002) and its standard Encode module.

The Perl texi2any/makeinfo both replaces and is intended to be (for all practical purposes) upward-compatible with the C makeinfo. It has many new features not in the C makeinfo. For example, cross-manual references are now fully supported, and allows for extensive customization of the HTML output. See the `Generic Translator texi2any' chapter in the manual (among other places) for more about this reimplementation.

The new program is, unfortunately, noticeably slower at present than the C program was. We hope all the many improvements make the new version worthwhile for users nevertheless.

GNU Texinfo 5.0: History

Reimplementing in Perl

In 2012, the C makeinfo was itself replaced by a Perl implementation generically called texi2any. This version supports the same level of output customization as texi2html, an independent program originally written by Lionel Cons, later with substantial work by many others. The many additional features needed to make texi2html a replacement for makeinfo were implemented by Patrice Dumas. The first never-released version of texi2any was based on the texi2html code. That implementation, however, was abandoned in favor of the current program, which parses the Texinfo input into a tree for processing. It still supports nearly all the features of texi2html.

The new Perl program is much slower than the old C program. We hope the speed gap will close in the future, but it may not ever be entirely comparable. So why did we switch? In short, we intend and hope that the present program will be much easier than the previous C implementation of makeinfo to extend to different output styles, back-end output formats, and all other customizations. In more detail:

HTML customization. Many GNU and other free software packages had been happily using the HTML customization features in texi2html for years. Thus, in effect two independent implementations of the Texinfo language had developed, and keeping them in sync was not simple. Adding the HTML customization possible in texi2html to a C program would have been an enormous effort.
Unicode, and multilingual support generally, especially of east Asian languages. Although of course it’s perfectly plausible to write such support in C, in the particular case of makeinfo, it would have been tantamount to rewriting the entire program. In Perl, much of that comes essentially for free.
Additional back-ends. The makeinfo code had grown convoluted to the point where adding a new back-end was quite complex, requiring complex interactions with existing back-ends. In contrast, our Perl implementation provides a clean tree-based representation for all back-ends to work from. People have requested numerous different back-ends (LaTeX, the latest (X)HTML, …), and they will now be much more feasible to implement. Which leads to the last item:
Making contributions easier. In general, due to the cleaner structure, the Perl program should be considerably easier than the C for anyone to read and contribute to, with the resulting obvious benefits.
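
To illustrate the "clean tree-based representation for all back-ends" idea mentioned above, here is a minimal sketch in Python. It is not taken from texi2any: the `Node` class, the node kinds, and the `to_plain` and `to_html` functions are all hypothetical names made up for this example. The point is only that once the input has been parsed into a tree, each output format can be a separate function that walks the same tree.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Node:
    """One node of a hypothetical document tree (not texi2any's real structure)."""
    kind: str                      # "document", "section", "para", or "text"
    text: str = ""                 # payload for "text" nodes and section titles
    children: list["Node"] = field(default_factory=list)


def to_plain(node: Node) -> str:
    """Hypothetical back-end 1: render the tree as plain text."""
    if node.kind == "text":
        return node.text
    body = "".join(to_plain(c) for c in node.children)
    if node.kind == "section":
        # Underline the section title with '=' characters.
        return f"{node.text}\n{'=' * len(node.text)}\n\n{body}"
    if node.kind == "para":
        return body + "\n\n"
    return body                    # "document" and unknown kinds: children only


def to_html(node: Node) -> str:
    """Hypothetical back-end 2: render the same tree as HTML."""
    if node.kind == "text":
        return escape(node.text)   # escape &, <, > for HTML output
    body = "".join(to_html(c) for c in node.children)
    if node.kind == "section":
        return f"<h2>{escape(node.text)}</h2>\n{body}"
    if node.kind == "para":
        return f"<p>{body}</p>\n"
    return body


# A tiny tree that both back-ends consume unchanged.
tree = Node("document", children=[
    Node("section", "Intro", [
        Node("para", children=[Node("text", "C & Perl")]),
    ]),
])

print(to_plain(tree))
print(to_html(tree))
```

In a design like this, adding another output format means writing one more function over `Node`, without touching the existing back-ends. That is presumably the kind of benefit the Texinfo developers have in mind, though their actual implementation is certainly more involved.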
