Presenting only the content that readers want to use is a challenge, though. Because periodical content is largely unstructured (it is not available in tables, nor does it necessarily follow any schema), indexing the information is tough. Most magazines use PRISM, a flavor of XML, to help add structure to content. With so much of its content being mathematical and scientific in nature, AIP is using MathML, semantic analysis and other variants to create metadata. “Once everything is tagged properly, we will use MarkLogic [see disclosure] to index, search and deliver content on demand,” explains Wonder. “The tagging is the tough part.” As a result, readers should be able to find snippets of information that reside in 7,500-word documents, “which is precisely what readers want if they are standing in the field looking for an answer,” says Wonder. In both of these cases, content is not so much repurposed as deconstructed and served, with simplification being key. For other periodicals,...
- David Miller