Round Trip Textile
Posted in Kiloblog, Professional, Software Development
I’m all about Textile.
I’ve installed Textile 2.0 on every WordPress installation, and I regularly use Backpacks and Instiki in my work, in recovery, in New Orleans. I’ve created lesson plans around Textile, and taught neighborhood leaders to use Textile, instead of web-based WYSIWYG.
Why? Because Textile works. It’s easier to teach than a flaky JavaScript editor. Once they get it, they use it, and there is little follow-up.
I’m a fan.
Thus, I’m creating a Textile parser for Java, which is my language of choice. There are a few available already, I know. My needs are a tad different.
It follows that my implementation is a tad different as well. Rather than processing a string in place, I’m going line by line, building the paragraph blocks, and then processing the contents recursively.
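A minimal sketch of that line-by-line idea, in Java. The names (`BlockSplitter`, `splitBlocks`, `inline`) are made up for illustration, and the rules are assumptions far simpler than real Textile: a blank line ends a block, and only `*strong*` and `_emphasis_` are recognized, with no word-boundary checks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not any existing Textile implementation.
class BlockSplitter {

    /** Walks the input line by line; a blank line ends the current block. */
    static List<String> splitBlocks(String text) {
        List<String> blocks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : text.split("\r?\n", -1)) {
            if (line.trim().isEmpty()) {
                if (current.length() > 0) {
                    blocks.add(current.toString());
                    current.setLength(0);
                }
            } else {
                if (current.length() > 0) current.append('\n');
                current.append(line);
            }
        }
        if (current.length() > 0) blocks.add(current.toString());
        return blocks;
    }

    /**
     * Recursively converts *strong* and _emphasis_ spans inside a block.
     * A marker with no closing partner is left as literal text, so the
     * output is always well-formed.
     */
    static String inline(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '*' || c == '_') {
                int close = s.indexOf(c, i + 1);
                if (close > i + 1) { // require a non-empty span
                    String tag = (c == '*') ? "strong" : "em";
                    return escape(s.substring(0, i))
                            + "<" + tag + ">" + inline(s.substring(i + 1, close)) + "</" + tag + ">"
                            + inline(s.substring(close + 1));
                }
            }
        }
        return escape(s);
    }

    /** Escapes the characters that would otherwise break XML text content. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Calling `splitBlocks` on a document and then `inline` on each block gives the two passes described above. Because the inner span is processed before the closing tag is written, a mis-nested marker inside it can only fall through to literal text.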
I’m finding that there are some serious differences between the Perl and PHP implementations of Textile, and the Ruby RedCloth implementation.
Thus, I’m curious as to whether or not we can standardize on a definition of Textile.
I’m also finding that all of the implementations will produce invalid XML if fed the right sort of garbage. Because I’m emitting SAX events, that will not work for me. I’d like bad Textile to still produce valid XML.
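One way to get that guarantee with SAX is to route every event through a small wrapper that tracks which elements are open. This is only a sketch under assumptions: `BalancedEmitter` and its methods are hypothetical names, elements carry no attributes or namespaces, and only nesting is enforced, not validity against any XHTML schema. The `ContentHandler` and `AttributesImpl` types are the standard `org.xml.sax` ones.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical wrapper: forwards SAX events while keeping start/end balanced.
class BalancedEmitter {
    private final ContentHandler out;
    private final Deque<String> open = new ArrayDeque<>();

    BalancedEmitter(ContentHandler out) {
        this.out = out;
    }

    void start(String name) throws SAXException {
        open.push(name);
        out.startElement("", name, name, new AttributesImpl());
    }

    /**
     * Ends the named element, first closing anything still open inside it.
     * A stray end with no matching start is ignored.
     */
    void end(String name) throws SAXException {
        if (!open.contains(name)) return;
        String top;
        do {
            top = open.pop();
            out.endElement("", top, top);
        } while (!top.equals(name));
    }

    void text(String s) throws SAXException {
        out.characters(s.toCharArray(), 0, s.length());
    }

    /** Closes every element still open, for example at the end of a block. */
    void closeAll() throws SAXException {
        while (!open.isEmpty()) {
            String top = open.pop();
            out.endElement("", top, top);
        }
    }
}
```

With this in place the parser can be sloppy: an unclosed `strong` is closed when its paragraph ends, and an end event for an element that was never opened simply disappears.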
I’m particularly interested, because I’d like to create a round trip implementation of Textile. One where, if you can go from Textile to XML, then you can go back to Textile, so that I can store the Textile as XML.
This would be a subset of XHTML, and it would be limited. Only that XHTML which could have been produced by Textile would be acceptable.
In order to do this though, it would be nice to start from a definition of the Textile language, a BNF Textile grammar.
Is there interest in creating a definition of the Textile language?
4 Responses to “Round Trip Textile”
I would love to work on a standardized Textile. The issue is that the Textile core really belongs to “Dean Allen”:http://www.textism.com/.
Most Textile implementations offer substantial extensions (some that don’t necessarily belong in a markup language [Google queries for example]) which would sometimes be contrary to the inherent simplicity of formatting the language was designed to create. Language standardization is going to be a long road on this one, because with each iteration the various implementations are straying further from Dean’s simple core language. Typical Programmer vs. Designer type stuff.
If you would like to start conspiring on getting a more XHTML and XML safe variation of Textile2 going, I’m all for it. Drop me a line or look me up on IM.
P.S. I updated “my Textile plugin”:http://idly.org/2007/03/22/textile-21/ last month. It plays nice with WP 2.1/2.2 and PHP5 again — and has a nice control panel to boot.
When I posted this I sent a message to Dean Allen, Brad Choate, and Why the Lucky Stiff. No response. We can keep trying them. People get busy with other things.
On my list of things to do is to try your latest Textile plugin. I use Textile because it forces people to think when they cut and paste from Word or from Yahoo! Mail. If they put their faith in the WYSIWYG editor that ships with WordPress, then they are sending me email asking what I did to break their website.
I’d like to take the designer approach, see if it makes sense to define an EBNF grammar, maybe using a tool like ANTLR. It would be nice to have a parser, rather than the sequence of regular expressions that define current implementations. It would be easier to port and stay consistent.
While a compiler tool speeds research, I would like to focus on a JavaScript implementation of Textile. It would be very nifty to have an immediate translation, with no trips to the server, except the final trip that submits XML rather than Textile.
Sorry the response took a few days… I had to ponder (and research EBNF and ANTLR a bit).
I’d say a grammar is the first step toward any and all of these goals (although I’m not sure if there is a parser available in javascript for any of them [and my javascript skills would need some major tuning up in order to implement one]). If you have some experience writing something like this, I’d bow to your experience and assist in any way I can.
As for the javascript implementation, do you intend to create a new namespace for Textile transmission, or shall we just use XHTML and create a parser to do XHTML -> Textile conversions on the return leg for editing?
BTW, I’m quite frequently on “IM”:http://idly.org/elsewhere/ if you want to toss ideas around.
No real experience with parsers or compiler compilers beyond a little ANTLR. I have used ANTLR to create some simple path language grammars, that is, a path language that describes a path in XML. That is all.
My experience with ANTLR was that it was akin to writing a regular expression. Once I got myself into that mindset, it didn’t seem altogether foreign.
There is no such thing as a JavaScript equivalent to ANTLR. One might look at the ANTLR output as a guide. It might be more likely that with a grammar that is known to work in Java, you’d simply write a parser by hand in JavaScript, from scratch, confident that you’d resolved problems with the grammar.
By problems with grammar, I mean things like “ambiguities”, where a series of tokens can match more than one “production”. See? I’ve reached the outer limits of my understanding of the world of computer language design with that sentence. But, maybe someone will correct me if I’m wrong, I think you can be more confident that you have a grammar that makes sense by using a tool like ANTLR than by writing it out by hand with unit tests.
XHTML to Textile, the idea would be that you’d look at XHTML and convert it back to Textile. Assuming that the XHTML was produced by Textile. So, yeah, that’s what I meant by round trip.
When I use Textile, I strive to stay semantic in my markup, so the output XHTML is very basic. It seems that I could write an XSLT transform that would turn it right back into Textile. I’d rather implement it not as a transform, but as an iterator over DOM, but that gave me the notion.
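Here is roughly what walking the DOM might look like, as a sketch only. `XhtmlToTextile` and `toTextile` are hypothetical names, the element-to-Textile mapping covers just a handful of tags, and it assumes the XHTML is as basic as what Textile would have produced, with no attention paid to whitespace between blocks.

```java
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;

// Hypothetical sketch: a recursive walk over a DOM tree that emits Textile.
class XhtmlToTextile {

    static String toTextile(Node node) {
        // Text (and CDATA) nodes pass through unchanged.
        if (node instanceof Text) return node.getNodeValue();

        // Convert the children first, then wrap them according to this element.
        StringBuilder inner = new StringBuilder();
        NodeList kids = node.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            inner.append(toTextile(kids.item(i)));
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) return inner.toString();

        switch (node.getNodeName()) {
            case "p":
                return inner + "\n\n";
            case "h1":
                return "h1. " + inner + "\n\n";
            case "strong":
                return "*" + inner + "*";
            case "em":
                return "_" + inner + "_";
            case "a":
                return "\"" + inner + "\":" + ((Element) node).getAttribute("href");
            default:
                // Unknown elements contribute only their content: a best guess.
                return inner.toString();
        }
    }
}
```

Passing a parsed `Document` (or any element) to `toTextile` returns the Textile text. The `default` branch is where the "best guess" lives: anything Textile could not have produced is flattened to its content rather than rejected.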
So, no, not a special Textile namespace, if I understand correctly, but a best guess at turning XHTML back into Textile.