What makes up LuaTeX
How it started
Occasionally the question pops up why we've chosen Lua instead of one of the other popular (interpreted) languages. And why do we use MetaPost and not something else? And why do we stick to the TeX language? Of course the same questions would be asked if we had chosen your-favourite-language instead. On this page some details of the development are discussed. Many more motives can be found in publications of user groups, presentations we give at user group meetings, mailing lists, manuals and ConTeXt resources. It is good to keep in mind that extensive language-related discussions are wasted on us. If you wonder why, just transport yourself 50 years into the future and consider how your current arguments would sound then (and how your code would be reviewed).
As the project originates in the ConTeXt community it makes sense to take a look at what is used there. Right from the start ConTeXt has been not just a set of macros but also a set of programs that manage the workflow. These programs take care of multipass issues like sorting indexes and make sure that the right number of passes is made.
The first versions were programmed in Modula-2, but when ConTeXt went public and had to run on all platforms we migrated to Perl. We had a brief flirtation with Lisp dialects but as these were not portable, Perl seemed a better choice. This took place around 1995. When a couple of years later we ran into Ruby, it was decided to rewrite the scripts in this language, which became our favourite. All kinds of helper scripts were added to the ConTeXt distribution.
A persistent problem was that in order to use ConTeXt efficiently one also had to install Ruby. As with Perl, this normally resulted in many more files, and the occasional compatibility issues were no fun for users. TeX distributions are already large, so such an extra dependency is unwelcome.
When playing with extending the SciTE editor, Lua came into view. At first I was reluctant to use yet another language but it quickly became clear that this small and efficient language (or subsystem) made the editor more powerful at the cost of only a little overhead. No additional libraries had to be installed.
Although one can program most of what is needed in TeX using the macro language, it sometimes can be handy to have a regular scripting language available. Therefore we started experimenting with embedding Lua in pdfTeX. At that point there was a little bit of access to TeX's internals and we could inject (print) text into TeX's input buffer.
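To give an idea of what printing into TeX's input buffer looks like, here is a minimal sketch as it could be written for present-day LuaTeX, not for the early pdfTeX experiments mentioned above. It assumes the plain TeX format and relies on the `\directlua` primitive and the `tex.print` function; the file name `hello.tex` is only an example.

```tex
% hello.tex (hypothetical file name); process with: luatex hello.tex
% \directlua runs its argument as Lua code. Whatever that code passes to
% tex.print is fed back into TeX's input and typeset like ordinary source.
The answer is \directlua{tex.print(tostring(6 * 7))}.
\bye
```

The Lua code computes the number and TeX only sees the resulting characters, so the paragraph is typeset as if "42" had been typed in the source.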
A next step
We presented some of the possibilities at conferences. We also had some discussions among ConTeXt developers and decided to create a team: Hartmut Henkel (who did the first implementation), Taco Hoekwater (experienced TeX coder) and Hans Hagen (main author of ConTeXt). We all had some ideas about where TeX lacks openness and functionality but at the same time realized that adding hard coded solutions was not the way to go. Although at that moment one would expect discussions about where to go and what language to use, that didn't really happen. We'd been around long enough to know where we wanted to go and we had already fallen for Lua's charm as it somehow fit naturally into TeX and suited our plans.
Although in principle (and in the long run) a complete Lua implementation seems possible, we settled on a mix that provides both TeX as we know it and extensibility that suits it. For instance, currently LuaTeX uses the traditional linebreak method of TeX but you can replace it by your own (written in Lua). It makes no sense to hard code your, his, her and my solution.
So, development roughly boils down to:
- revising one of TeX's subsystems (for instance math)
- extending it as little as possible (for instance math needs some more functionality and proper math is TeX's trademark)
- isolating the component from the rest as much as possible (untangling global properties)
- providing an interface for extending the functionality and providing means to intercept and replace the component
Of course, as we need to keep the machinery working properly in order to make it usable for real production work, we have to decide carefully which steps to take when. It is not hard to imagine that once this is done we can also replace the main loop and use Lua to glue the components together, but if that's going to happen it will probably be after the stable version 1.0 is out. At that point you can decide if you want to run LuaTeX in TeX mode (with a macro package that provides functionality using Lua) or start it in Lua mode (where you can call TeX library functions).
The roadmap on this website details what we do and intend to do and it will be updated regularly. We feel free to add or change objectives so don't pin us down on the exact wording.
One can argue that TeX is old and that we need something completely new. However, there is a large body of documents out there and users have written many macro packages that they depend on. So, why abandon something that works well? In most cases the things that we want to do better are hidden from the user anyway. In that respect the Lua in LuaTeX has two faces: users can use the embedded scripting language for any purpose they like without ever touching TeX's internals. Developers on the other hand have access to the internals of the machinery and can replace or extend components. In principle a new input language can be defined, although this will probably be easier in future versions of LuaTeX as eventually it will be just a collection of libraries glued together. However, that is not our main purpose as the traditional TeX user just wants TeX. And it's the loyal long term users (and user groups) that got us here in the first place.
The Lua language is developed in an academic setting. Around the time that we ran into this language it had become quite stable after many years of research. Just read the articles by its creators, buy the user manual, and/or meet them in person at some conference and you'll understand the atmosphere and spirit in which this language evolved.
Don't get us wrong: we use other languages in our daily work and each has its pros and cons, but we feel that Lua is just a good choice for what we want to achieve. If this means that in order to use it you have to learn yet another language, so be it. The other way around would have been that we would have to learn yet another language, one that would undoubtedly be more complex to integrate into distributions.
Because MetaPost, as a graphical subsystem, is rather tightly integrated in ConTeXt, it is no surprise that we started a subproject that would make integration more efficient. Improving and extending this program-turned-library is an ongoing effort. Don't see MetaPost as a universal replacement for drawing packages, just as a pretty good base system to have available in the core of LuaTeX. You can ignore it if you don't like it.
Other closely related projects are the LM and TeXGyre font projects. Eventually this will lead to a set of OpenType fonts (including math) that can replace the multitude of Type 1 fonts in TeX distributions.
A naive view on LuaTeX sticks to "just TeX but extended with Lua" or "an opened up TeX using Lua". However, in practice LuaTeX is just the core of a macro package. It's the macro package that opens up all the functionality to users: languages, fonts, math, etc. LuaTeX is a couple of megabytes in a (potentially) gigabyte whole.
In the past there have been several attempts to upgrade TeX. Examples of stable and accepted extensions are eTeX that provides more primitives, pdfTeX that provides an integrated backend and XeTeX that supports OpenType using external libraries. However, opening up subsystems to macro writers eventually might have more impact as it permits us to extend the core functionality of the program without changing the engine. It is for this reason that we decided to come up with a new version of ConTeXt that permits drastic redesigns where applicable and that also can explore and use new functionality. Imagine that we'd have to discuss extensions to the engine in a larger perspective: nothing would happen eventually as there are often multiple solutions for a (typographical) problem. Just leaving the extensions to the macro package writers is more efficient.
Bits and pieces of the code that is part of ConTeXt MkIV have a generic character and can be used outside ConTeXt. We will occasionally add generic components to the distribution.
We started by mentioning a few languages and the dependency on the installation of interpreters. A nice side effect of the LuaTeX project is that it removes this dependency, as from now on TeX distributions ship with a scripting engine, called LuaTeX. This already has some impact on distributions as traditional scripts are being replaced by Lua scripts, which makes distributions even more portable.
One related project boosted the development by providing grants for coding. In this project Idris Samawi Hamid, Taco Hoekwater and Hans Hagen work together on high end Arabic typesetting in the perspective of critical editions. A welcome aspect of this project is that it provides a good stress test for LuaTeX and ConTeXt MkIV. It also clearly demonstrates that the engine is just one of many factors, as much time and effort goes into, for instance, uncovering the secrets of OpenType, creating appropriate fonts, and figuring out methods for fulfilling the ambitious objectives of this project.
Did we make the right choices? We can safely say that if we'd chosen another language, chances are pretty low that ConTeXt would have undergone such a drastic reimplementation. To mention one reason: one has to like a language in order to use it. Also, no matter what other language we'd chosen, there would be those questioning that choice.
Not to be neglected is another argument: by following this route we are able to do the job within reasonable time. By adapting ConTeXt we test the mechanisms in practice. By starting from the TeX end we keep compatibility. We're long time users of TeX and tend to make such decisions on a similar long term. It's easy to comment on an ongoing effort but 30 years of TeX have proven that lots of discussion beforehand does not lead to usable products.
Just look at it this way: something is happening and the people who do the work also happen to be extensive users who are quite rooted in the TeX community. Only history can prove us wrong.
We have been and will be publishing and talking about the developments on a regular basis. As this happens at user group meetings and in user group journals, members of those groups are most likely to hear the news first. Most user groups put copies of their journals on the web some time after publishing. Of course there is the LuaTeX manual, available for everyone as the ultimate reference.