MohammadTaghi Hajiaghayi,
Department of Computer Science,
University of Waterloo,
mhajiaghayi@math.uwateroo.ca
Available On-line at:
http://www.cs.uwaterloo.ca/~mhajiaghayi/mathml/mathml1.html
Slides of this presentation
Abstract
Mathematical Markup Language (MathML) is an XML application for describing mathematical documents on the Web, much as HTML does for text documents. Its ultimate goal is to provide a standard notation that other Web software and applications can read and render easily.
Because there has been no standard notation for representing mathematics on the Web, the current practice is to embed mathematical expressions as GIFs or other image types. Images are a poor fit for this purpose. To copy and paste a formula stored as a GIF, you must copy and paste the image file itself, and changing the formula even slightly is just as awkward. With MathML, by contrast, you can copy, paste, and edit a mathematical formula like any other HTML text. MathML also makes formulas accessible to people with visual disabilities, because MathML documents can easily be converted into alternative media such as speech or Braille. Finally, MathML is a more efficient way to transfer mathematical formulas over the Web, since sending image files instead of formulas is inefficient, mainly because of their large size.
At present, the W3C, the main sponsor of MathML, has released three versions of the specification: the MathML 1.0 Recommendation, released on 7 April 1998; the MathML 1.01 Recommendation, released on 7 July 1999 [1]; and the MathML 2.0 Proposed Recommendation, published on 8 January 2001 [2].
Several vendors now offer plug-ins and applets that can render MathML. However, there are still only a few translators and equation editors capable of generating HTML pages with embedded MathML. So far, W3C's Amaya browser supports MathML, as does E-Lite by ICEsoft. Netscape's new, still unreleased browser, Gecko, is planned to support MathML fully and may include a WYSIWYG editor for mathematical equations [3]. It is hoped that MathML will also become available through plug-ins for browsers that cannot yet implement it fully.
Many organizations have expressed support for MathML, including IBM, makers of the Techexplorer Scientific Browser; Wolfram Research, makers of Mathematica 3.0; Waterloo Maple, makers of Maple V; Hewlett-Packard, makers of the EzMath plug-in; the American Mathematical Society, developers of a LaTeX to MathML translator; and Design Science, makers of the MathType equation editor and the WebEQ suite of MathML tools [3]. There are also conferences and workshops, e.g. the MathML International Conference 2000 [4], and discussion sites, e.g. [7], devoted to MathML, as well as many brief papers introducing MathML or describing its features, such as [5, 6, 8].
2. MathML Overview
2.1. Presentation and Content
MathML supports both content encoding and presentation encoding: presentation encoding describes how a formula looks, while content encoding describes what it means. You can choose between them depending on which is easier and more appropriate for the formula at hand. You can even mix the two encodings and use a hybrid of both.
MathML has about 28 presentation elements, with about 50 attributes [1]. Most of these elements represent templates or patterns for laying out subexpressions. For example, mfrac expresses a fraction of two subexpressions by placing one over the other with a line in between. When you use such an element, the renderer or other MathML software displays or prints it in an appropriate form, and you need not supply any further information about the particular software or hardware medium. You may still specify details such as the thickness of the line through attributes; default values are used if you do not set them explicitly. Presentation encoding therefore provides a middle level of abstraction for expressing mathematical formulas.
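As a minimal sketch of such a layout element, the fragment below places a over b with mfrac. The identifiers a and b and the linethickness value are illustrative assumptions; if the attribute is omitted, the renderer's default line thickness applies.

```xml
<!-- Illustrative fraction: a over b, with a thicker-than-default line -->
<mfrac linethickness="thick">
  <mi>a</mi>  <!-- first child: numerator -->
  <mi>b</mi>  <!-- second child: denominator -->
</mfrac>
```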
Content encoding has about 75 elements but only about 12 attributes [1]. The number of attributes is much smaller than for presentation encoding, mainly because content encoding rarely deals with how a formula is displayed. Most content elements represent mathematical operations and functions, such as plus and sin, or other mathematical concepts, such as set and vector. These elements provide a high-level way of expressing formulas.
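A minimal sketch of how such content elements combine, assuming the illustrative expression sin(x) + 1 (chosen here for demonstration, not taken from the original text). Each operation is wrapped in an apply element whose first child names the operator or function.

```xml
<!-- sin(x) + 1 in content markup -->
<apply>
  <plus/>          <!-- operator: addition -->
  <apply>
    <sin/>         <!-- function: sine -->
    <ci>x</ci>     <!-- content identifier -->
  </apply>
  <cn>1</cn>       <!-- content number -->
</apply>
```

Note that the markup carries no display attributes at all; how the sum is drawn is left entirely to the renderer.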
Together, the presentation and content elements of MathML provide a rich set of means for representing mathematical expressions without dealing with very low-level, media-dependent attributes.
Now, let us consider an example. Suppose you want to show this expression:
(a * b)^2
The presentation encoding for this formula is as follows:
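A minimal sketch of such an encoding, assuming the product is written with the literal operator character * and the parentheses are produced by mfenced:

```xml
<!-- (a * b)^2 in presentation markup -->
<msup>
  <mfenced>          <!-- base: fenced (parenthesized) subexpression -->
    <mrow>
      <mi>a</mi>
      <mo>*</mo>
      <mi>b</mi>
    </mrow>
  </mfenced>
  <mn>2</mn>         <!-- superscript: the exponent -->
</msup>
```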
Now, let us show the same expression using content encoding:
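A minimal sketch of such a content encoding, assuming the times element for the product and the power element for the exponent:

```xml
<!-- (a * b)^2 in content markup, written in prefix form -->
<apply>
  <power/>
  <apply>
    <times/>
    <ci>a</ci>
    <ci>b</ci>
  </apply>
  <cn>2</cn>
</apply>
```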
As you can see, the structure of this example is very similar to that of the presentation encoding, but it uses content elements instead of presentation elements. Each subexpression is enclosed between <apply> and </apply> tags and is written in prefix form. This form is common in content encoding, because MathML content encoding treats every operation as a function applied to some operands. Also, <ci> and <cn> tags are used instead of <mi> and <mn> tags.
In addition, we can mix content and presentation encoding to obtain a hybrid encoding of the above expression, as in the example below.
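A minimal sketch of one possible hybrid encoding of (a * b)^2, assuming the renderer accepts a complete content fragment (here an apply element) nested inside presentation markup. The split between the presentation and content parts is an illustrative choice.

```xml
<!-- Presentation markup for the power and the fence,
     content markup for the product inside the fence -->
<msup>
  <mfenced>
    <apply>
      <times/>
      <ci>a</ci>
      <ci>b</ci>
    </apply>
  </mfenced>
  <mn>2</mn>
</msup>
```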
Most expressions in MathML are nested, and the best way to visualize a nested formula, especially a complicated one, is as an expression tree. In an expression tree, each node represents an operation or a particular layout schema, and its children represent the subexpressions. This description not only helps in understanding an expression, but also shows how the MathML tags should be nested and how the expression is laid out on the screen or the printer. We discuss this issue further after introducing layout boxes.
The figure below, taken from [3], shows a mathematical expression and its tree structure:
Layout boxes correspond to the nodes of the expression tree, and the two models together, expression trees and layout boxes, provide an abstract model for representing, evaluating, and rendering mathematical formulas. This abstract model helps people understand expressions, lets a renderer determine the size and dimensions of the boxes, and gives software such as computer algebra systems a basis for evaluating an expression.
It is worth mentioning that although simple layout boxes are the smallest substructures of MathML presentation encoding, they are still media-independent and remain a middle-level model.
You can see the layout boxes and their corresponding nodes in the above figure. For example, to compute the dimensions of the root of this tree, you first compute the dimensions of the simple layout box 3, then recursively compute the size of the compound layout box (x + 2), and finally take into account attributes such as the "thickness" or "length" of the fraction line. Using all this information, you can compute the size of the whole box for 3/(x + 2).
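The nesting of layout boxes described above corresponds to markup along the following lines. This is a sketch rather than the markup behind the figure; the explicit linethickness value is an illustrative assumption and simply restates the default.

```xml
<!-- Outer box: the whole fraction 3/(x + 2) -->
<mfrac linethickness="1">
  <mn>3</mn>         <!-- simple layout box: the numerator -->
  <mrow>             <!-- compound layout box: the denominator x + 2 -->
    <mi>x</mi>
    <mo>+</mo>
    <mn>2</mn>
  </mrow>
</mfrac>
```

A renderer would size the mn box and the mrow box first, then combine their dimensions with the line thickness to size the enclosing mfrac box.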
The style of markup elements in MathML is very similar to that of HTML. However, because a mathematical formula contains many more notational elements than regular text, MathML naturally uses a very large number of start and end tags. The syntax for tag attributes is also similar to that of HTML.
<mfenced>
1: Sets (Generic form: <set> [<elt1> <elt2> ... | <condition>] </set>)
A set represents a collection of data items, all of which must be of the same type. There are two ways to describe a set: list its elements explicitly, or give only a condition such as "all real y such that 1 < y < 2". Such a condition can be expressed with the interval container element introduced below.
2: Intervals (Generic form: <interval> <pt1> <pt2> </interval>)
These are the main container elements of MathML; other kinds of container elements, such as list, are described in [1, 2].
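The two ways of describing a set can be sketched as the two separate fragments below. The element names a, b, c and the use of the in element to tie y to an open interval are illustrative assumptions; the second fragment does not itself state that y is real.

```xml
<!-- Fragment 1: a set given by listing its elements explicitly: {a, b, c} -->
<set>
  <ci>a</ci>
  <ci>b</ci>
  <ci>c</ci>
</set>

<!-- Fragment 2: a set given by a condition: all y such that 1 < y < 2 -->
<set>
  <bvar><ci>y</ci></bvar>
  <condition>
    <apply>
      <in/>
      <ci>y</ci>
      <interval closure="open">  <!-- open interval (1, 2) -->
        <cn>1</cn>
        <cn>2</cn>
      </interval>
    </apply>
  </condition>
</set>
```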
4.2. Operators and Functions
For example, consider the expression below:
You may find a wide variety of examples of this kind in references [1, 2, 3].
In this presentation we have introduced MathML and briefly discussed its main points. We said that MathML is a very good standard for using and displaying mathematics on the Web. The problem is that MathML code is not very readable, so we must rely on other software and interfaces to create, manipulate, and display it. This raises a question: would it not be better to use an alternative that already exists in software such as Mathematica or Maple? For example, HTML could use only a start tag like <mathematics use=Mathematica> and an end tag </mathematics> (with other possible values, such as Maple). This may be expensive in the short run (because of copyright), but perhaps much cheaper and more beneficial in the longer term. The copyright problem is also not so important, because the interfaces later developed for MathML will not necessarily be free, so the same problem will appear again. In addition, using such software would increase the portability of mathematics on the Web and in computing software generally. In the meantime, the MathML proposal in the draft paper could be published, not as a new standard but as a contribution to the discussion.
I believe that the answer to the above question is 'no', and my reasons are mainly those given in the Introduction section, but I think this question deserves a deeper answer. You can find further discussion of this issue in [7].
6. References:
[1]: Mathematical Markup Language (MathML[tm]) 1.01 Specification, W3C Recommendation, revision of 7 July 1999
[2]: Mathematical Markup Language (MathML) Version 2.0, W3C Proposed Recommendation, 8 January 2001
[3]: Gentle Introduction to MathML
[4]: MathML International Conference 2000
[5]: The Interchange of Mathematics in XML: MathML, OpenMath and their Application
[6]: MathML - What's in it for us?
[7]: The Disappointment and Embarrassment of MathML - update: Including Reactions and Answers
[8]: Putting Mathematical Notation on the Web