Last post here: Time to move

April 10, 2008

This is the last post on this blog; it is time to move on and gather all my stuff on one blog. From now on, everything about ‘odf4j’ will be available at this link.

See you!


Style Tree Implementation

October 5, 2007

/*
 * StyleTree.java
 *
 * Created on 28 Sep, 2007, 2:43:06 PM
 */

package org.openoffice.odf.style;

import java.util.Enumeration;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.TreeNode;

/**
 * Implementation of the Style Hierarchy in ODF documents.
 * EXPERIMENTAL CODE
 *
 * @author amitksaha <amitksaha@openoffice.org>
 */
public class StyleTree {
    private DefaultMutableTreeNode top;

    public void getStyle(String style_name) {
        // Create the nodes.
        top = new DefaultMutableTreeNode("ODT Style Families");

        createNodes(top);
        search(style_name);
    }

    // Builds a hard-coded sample hierarchy:
    // root > style family > parent style > style category > style name
    private void createNodes(DefaultMutableTreeNode top) {
        DefaultMutableTreeNode style_family = null;
        DefaultMutableTreeNode parent_style = null;
        DefaultMutableTreeNode style_name_1 = null;
        DefaultMutableTreeNode style_name_2 = null;
        DefaultMutableTreeNode style_name_3 = null;

        style_family = new DefaultMutableTreeNode("graphic");
        top.add(style_family);

        style_family = new DefaultMutableTreeNode("Paragraph");
        top.add(style_family);

        parent_style = new DefaultMutableTreeNode("standard");
        style_family.add(parent_style);

        style_name_1 = new DefaultMutableTreeNode("index");
        parent_style.add(style_name_1);

        style_name_2 = new DefaultMutableTreeNode("Text_20_body");
        parent_style.add(style_name_2);

        style_name_1 = new DefaultMutableTreeNode("P1");
        style_name_2.add(style_name_1);

        style_name_1 = new DefaultMutableTreeNode("P2");
        style_name_2.add(style_name_1);

        style_name_1 = new DefaultMutableTreeNode("Heading");
        style_name_2.add(style_name_1);

        style_name_1 = new DefaultMutableTreeNode("List");
        style_name_2.add(style_name_1);

        style_name_3 = new DefaultMutableTreeNode("caption");
        parent_style.add(style_name_3);

        style_family = new DefaultMutableTreeNode("table");
        top.add(style_family);

        style_family = new DefaultMutableTreeNode("table_row");
        top.add(style_family);
    }

    private void search(String item) {
        // depthFirstEnumeration() visits every node of the tree
        Enumeration<?> res = top.depthFirstEnumeration();

        while (res.hasMoreElements()) {
            DefaultMutableTreeNode node = (DefaultMutableTreeNode) res.nextElement();

            if (item.equals(node.toString())) {
                // getPath() runs from the root down to the node:
                // path[0] = root, path[1] = style family, path[2] = parent style
                TreeNode[] path = node.getPath();

                System.out.println("Style Information for: " + item);

                // Print only the ancestors that exist, so that shallow nodes
                // such as "graphic" do not cause a NullPointerException
                if (path.length > 2) {
                    System.out.println("Style Family: " + path[1]);
                }
                if (path.length > 3) {
                    System.out.println("Parent Style: " + path[2]);
                }
                if (path.length > 4) {
                    System.out.println("Previous Higher Style Category: " + path[path.length - 2]);
                }
            }
        }
    }
}


Style Hierarchy in ODT documents

September 4, 2007

I have identified a hierarchical relationship in the way style information is stored in the “styles.xml” file. Based on my findings, I have represented it as a Style tree.
Style Tree

Using this Style tree, we can retrieve the information associated with each style, such as its parent style and style family, by looking up the style name; the style names are the leaves of the tree. This saves us from parsing “styles.xml” every time.
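
To make the lookup idea concrete, here is a minimal sketch, assuming the tree is held in javax.swing.tree.DefaultMutableTreeNode objects. StyleTreeIndex and StyleTreeIndexDemo are hypothetical names, not part of odf4j, and the sample style names are the ones used in the Style Tree Implementation post. The tree is built once, every node is indexed by name, and the ancestors of a style are then read off the path from the root.

```java
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.swing.tree.DefaultMutableTreeNode;
import javax.swing.tree.TreeNode;

// Hypothetical helper (not part of odf4j): indexes a style tree once so that
// later lookups by style name need no XML parsing.
class StyleTreeIndex {
    private final Map<String, DefaultMutableTreeNode> byName = new HashMap<>();

    StyleTreeIndex(DefaultMutableTreeNode root) {
        Enumeration<?> nodes = root.depthFirstEnumeration();
        while (nodes.hasMoreElements()) {
            DefaultMutableTreeNode node = (DefaultMutableTreeNode) nodes.nextElement();
            // Assumption: style names are unique across the whole tree.
            byName.put(node.toString(), node);
        }
    }

    // Returns the names from the style family down to the style itself
    // (the artificial root is skipped); an empty list if the name is unknown.
    List<String> pathTo(String styleName) {
        List<String> names = new ArrayList<>();
        DefaultMutableTreeNode node = byName.get(styleName);
        if (node == null) {
            return names;
        }
        TreeNode[] path = node.getPath();
        for (int i = 1; i < path.length; i++) {
            names.add(path[i].toString());
        }
        return names;
    }
}

class StyleTreeIndexDemo {
    // Small sample tree: root > family > parent style > category > style name
    static DefaultMutableTreeNode sampleTree() {
        DefaultMutableTreeNode root = new DefaultMutableTreeNode("ODT Style Families");
        DefaultMutableTreeNode family = new DefaultMutableTreeNode("Paragraph");
        DefaultMutableTreeNode standard = new DefaultMutableTreeNode("standard");
        DefaultMutableTreeNode textBody = new DefaultMutableTreeNode("Text_20_body");
        root.add(family);
        family.add(standard);
        standard.add(textBody);
        textBody.add(new DefaultMutableTreeNode("P1"));
        return root;
    }

    public static void main(String[] args) {
        StyleTreeIndex index = new StyleTreeIndex(sampleTree());
        System.out.println(index.pathTo("P1"));
        // prints [Paragraph, standard, Text_20_body, P1]
    }
}
```

The map makes each lookup a constant-time operation, at the cost of assuming that no two styles in different families share a name.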


A possible approach to Style Handling using Java Beans

August 19, 2007

I shall make an effort to explain how we could possibly work with
Styles using Java Beans.

As I understand it, the entry point for retrieving the style
information for an element is its text:style-name attribute, which we
use to dig further into the <office:automatic-styles> section of
“content.xml”.

For each element’s style information, we could have an associated
Java Bean whose properties reflect that style information. Being a
Java Bean, it gives us getter/setter methods for each style property
by convention. The bean would be created automatically when the user
calls the API methods that get or set style information.
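
As a rough illustration of this idea, the sketch below shows what such a bean might look like. ParagraphStyleBean, StyleBeanDemo and fromAttributes are hypothetical names, and the choice of properties is an assumption made for the example; the attribute map stands in for whatever the API would read from the matching entry in <office:automatic-styles>.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical bean: one property per piece of style information.
class ParagraphStyleBean implements Serializable {
    private static final long serialVersionUID = 1L;

    private String styleName;
    private String parentStyleName;
    private String textAlign;

    // Java Beans need a public no-argument constructor.
    public ParagraphStyleBean() { }

    public String getStyleName() { return styleName; }
    public void setStyleName(String styleName) { this.styleName = styleName; }

    public String getParentStyleName() { return parentStyleName; }
    public void setParentStyleName(String parentStyleName) { this.parentStyleName = parentStyleName; }

    public String getTextAlign() { return textAlign; }
    public void setTextAlign(String textAlign) { this.textAlign = textAlign; }
}

class StyleBeanDemo {
    // Hypothetical stand-in for what the API would do internally: fill the
    // bean from the attributes found for the given style name.
    static ParagraphStyleBean fromAttributes(String styleName, Map<String, String> attributes) {
        ParagraphStyleBean bean = new ParagraphStyleBean();
        bean.setStyleName(styleName);
        bean.setParentStyleName(attributes.get("style:parent-style-name"));
        bean.setTextAlign(attributes.get("fo:text-align"));
        return bean;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = new HashMap<>();
        attributes.put("style:parent-style-name", "Text_20_body");
        attributes.put("fo:text-align", "center");

        ParagraphStyleBean bean = fromAttributes("P1", attributes);
        System.out.println(bean.getStyleName() + " -> " + bean.getParentStyleName()
                + ", align=" + bean.getTextAlign());
        // prints: P1 -> Text_20_body, align=center
    }
}
```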


Interface-based approach to Style Handling

August 19, 2007

Bernd Eilers suggested an interface-based approach to handling Style information in odf4j:

My approach would be to use something like the following which is almost
completely interface based and hides the real implementation classes as
inner class of a factory making it unavailable for the API to construct
objects without using their factory. Also I would not want to have a
generic Object getProperty(String name) and expose static String
constants for attribute names or something like this as I consider the
list of things to get/set fixed by the specification and thus we only
need specific getter and setter methods and nothing generic. I would
also like to suggest to derive specific Styles from a common base Style
interface for each different Style family.

public interface Style {
        public String getDisplayName();
        public String getStyleFamily();
        ...
        public Style getParentStyle();
}
public interface SectionStyle extends Style {
        public String getBackgroundColor();
        public StyleBackgroundImage getBackgroundImage();
        ...
}
public interface TextStyle extends Style {
}
// note this is a little bit special because ParagraphStyles
// contain the same things as TextStyles plus some more
// and so we can inherit here
public interface ParagraphStyle extends TextStyle {
        public int getBackgroundTransparency();
        ...
}

// special stuff eg. for subelements of styles gets
// into own interfaces, for example
public interface StyleBackgroundImage {
        public static final String REPEAT_NO="no-repeat";
        ...
        public String getRepeat();

        public void setRepeat(String newRepeat);
        ...
}

and now in Style Factory have an abstract implementation base class as
inner class plus implementation classes derived from it as inner classes

public class StyleFactory {
     // abstract base class; it cannot also be final
     public abstract class StyleImpl implements Style {
         // constructor
         public StyleImpl(Node node) { ... }
     }
     // not final, because ParagraphStyleImpl extends it
     public class TextStyleImpl extends StyleImpl implements TextStyle {
         ...
     }
     public final class ParagraphStyleImpl extends TextStyleImpl
             implements ParagraphStyle {
         ...
     }
}

I would like to suggest doing the same with existing classes
like Element and ElementFactory, that is, converting Element, BlockElement
etc. to interfaces and having ElementImpl, BlockElementImpl etc. as inner
classes inside the ElementFactory class implement them.

I forgot to mention one main reason for using this more interface-centric
approach with inner classes for the implementation.

Later on, when we move forward, this API can be expressed in UNO IDL,
while classes with constructors that take Java-specific org.w3c.dom.Node
elements as arguments cannot be expressed as easily in UNO IDL.
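
The following is a minimal, self-contained sketch of how this suggestion might look in practice; it is not Bernd's code or the odf4j implementation. It assumes the implementation classes are private nested classes, so that only the factory can construct them, and it adds a hypothetical createStyle method and a StyleFactoryDemo class that are not in the original proposal. The interfaces are cut down to two getters, and the style node is built in memory rather than read from a document.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

interface Style {
    String getDisplayName();
    String getStyleFamily();
}

interface TextStyle extends Style { }

// Paragraph styles contain everything text styles do, so they inherit.
interface ParagraphStyle extends TextStyle { }

final class StyleFactory {
    private StyleFactory() { }

    // Hypothetical entry point: callers only ever see the interfaces.
    static Style createStyle(Element node) {
        String family = node.getAttribute("style:family");
        if ("paragraph".equals(family)) {
            return new ParagraphStyleImpl(node);
        }
        if ("text".equals(family)) {
            return new TextStyleImpl(node);
        }
        throw new IllegalArgumentException("Unsupported style family: " + family);
    }

    // Private nested classes: nothing outside the factory can construct them.
    private abstract static class StyleImpl implements Style {
        private final Element node;

        StyleImpl(Element node) { this.node = node; }

        @Override public String getDisplayName() { return node.getAttribute("style:display-name"); }
        @Override public String getStyleFamily() { return node.getAttribute("style:family"); }
    }

    private static class TextStyleImpl extends StyleImpl implements TextStyle {
        TextStyleImpl(Element node) { super(node); }
    }

    private static final class ParagraphStyleImpl extends TextStyleImpl implements ParagraphStyle {
        ParagraphStyleImpl(Element node) { super(node); }
    }
}

class StyleFactoryDemo {
    // Builds an in-memory <style:style> element for the example.
    static Element sampleStyleNode(String family, String displayName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element node = doc.createElement("style:style");
            node.setAttribute("style:family", family);
            node.setAttribute("style:display-name", displayName);
            return node;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Style style = StyleFactory.createStyle(sampleStyleNode("paragraph", "Text body"));
        System.out.println(style.getDisplayName() + " / " + style.getStyleFamily()
                + " / " + (style instanceof TextStyle));
        // prints: Text body / paragraph / true
    }
}
```

Because the DOM type appears only inside the factory, the public interfaces stay free of Java-specific parameter types, which is the property the UNO IDL argument relies on.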


odf4j Class Diagrams

July 19, 2007

The high-level package structure, reproduced below, is available for download here (recommended):

High Level Package structure of odf4j

I created an entry at http://odftoolkit.openoffice.org/servlets/ProjectDocumentList under the “odf4j” project folders to store the UML class diagrams of the “odf4j” project.
The class diagrams can be accessed here.


Updates

June 16, 2007

After a fortnight of coding inactivity, I am now working on Style Handling in ODT documents.

In the meantime, my role has also been upgraded to developer.


OdtToText2.java - Revision 1.2

May 29, 2007

Added support for “sections” in ODT documents.

View the CVS log here


OdtToText2.java (Initial Version Committed)

May 21, 2007

The initial version of OdtToText2.java was committed to CVS by Bernd Eilers. View the CVS log here

Features

  1. Extracts text and headings from an ODT file
  2. Uses the classes TextBody, BlockContent, Element, etc, in odf.text
  3. No manual SAX parsing

To Do’s

  1. Extend it to extract other information from an ODT file, such as table information.

Test Run:

  • Input - ODT file containing simple text, a heading and a list of elements
  • Output:

    DEBUG unhandled elem is org.openoffice.odf.text.UnknownElement node=office:forms
    DEBUG unhandled elem is org.openoffice.odf.text.UnknownElement node=text:sequence-decls
    He heard quiet steps behind him. That didn’t bode well. Who could be following him this late at night and in this deadbeat part of town? And at this particular moment, just after he pulled off the big time and was making off with the greenbacks. Was there another crook who’d had the same idea, and was now watching him and waiting for a chance to grab the fruit of his labor?

    *  aaa
    *  bbb
    *  ccc
    *  ddd
    *  eee
    *  ffff

    ==== Heading ====
    text below heading
    *  aaa
    *  bbb
    *  ccc
    *  ddd
    *  eee
    *  ffff


Challenge to be taken care of (Bernd Eilers)

May 19, 2007

What is returned as java.util.List by that textBody.getContent() call is
in fact an instance of class BlockContent, which extends
java.util.AbstractList. The listIterator() and iterator() methods
of AbstractList, which are currently not overridden in
BlockContent, call the get(int index) method with an
advancing index for every next() call on the Iterator. The get(int
index) method which we have in BlockContent now basically starts every
time at the first childElement, advancing until it gets to the index
element at each call. Taken together, this means that a full pass with
the current iterator takes time quadratic in the number of elements,
which is highly inefficient for large documents. So it would be a good
idea to implement an inner class which implements ListIterator for
BlockContent and to override the iterator and listIterator methods to
return an instance of that class. This inner class should keep a pointer
into the DOM tree to remember where it is and just call
factory.getElement(), similar to what is done in the get(int index)
method.
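
The suggestion can be sketched as follows. This is an illustration under stated assumptions, not the odf4j code: ChildElementList is a hypothetical stand-in for BlockContent, it returns raw DOM elements where the real class would call factory.getElement(), and it overrides only iterator(); a full ListIterator would keep the same kind of cursor and also walk backwards with getPreviousSibling().

```java
import java.util.AbstractList;
import java.util.Iterator;
import java.util.NoSuchElementException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical stand-in for BlockContent: a list view of a node's child elements.
class ChildElementList extends AbstractList<Element> {
    private final Node parent;

    ChildElementList(Node parent) { this.parent = parent; }

    // Skips text nodes, comments etc. and returns the next element, or null.
    private static Element firstElementFrom(Node node) {
        while (node != null && node.getNodeType() != Node.ELEMENT_NODE) {
            node = node.getNextSibling();
        }
        return (Element) node;
    }

    // Index-based access restarts at the first child on every call.
    @Override
    public Element get(int index) {
        if (index < 0) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        Element element = firstElementFrom(parent.getFirstChild());
        for (int i = 0; i < index && element != null; i++) {
            element = firstElementFrom(element.getNextSibling());
        }
        if (element == null) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return element;
    }

    @Override
    public int size() {
        int count = 0;
        for (Element e = firstElementFrom(parent.getFirstChild()); e != null;
                e = firstElementFrom(e.getNextSibling())) {
            count++;
        }
        return count;
    }

    // The iterator keeps a cursor into the DOM, so each next() is one sibling step.
    @Override
    public Iterator<Element> iterator() {
        return new Iterator<Element>() {
            private Element cursor = firstElementFrom(parent.getFirstChild());

            @Override public boolean hasNext() { return cursor != null; }

            @Override public Element next() {
                if (cursor == null) {
                    throw new NoSuchElementException();
                }
                Element current = cursor;
                cursor = firstElementFrom(current.getNextSibling());
                return current;
            }
        };
    }
}

class BlockIterationDemo {
    // Builds a tiny in-memory body: a heading, a text node and two paragraphs.
    static Node sampleBody() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element body = doc.createElement("office:text");
            doc.appendChild(body);
            body.appendChild(doc.createElement("text:h"));
            body.appendChild(doc.createTextNode("\n"));
            body.appendChild(doc.createElement("text:p"));
            body.appendChild(doc.createElement("text:p"));
            return body;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (Element element : new ChildElementList(sampleBody())) {
            System.out.println(element.getNodeName());
        }
    }
}
```

With the cursor, a full pass over the children costs one sibling step per element, instead of a fresh walk from the first child for every index.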