Here are some links on ripping Word documents apart, storing the pieces in a SQL database, and later retrieving and re-gluing them.
- C# – How to store formatted snippets of Microsoft Word documents in SQL Server – Stack Overflow
- How to Store, Read & Delete a Word Document in Database Using .NET code snippet and examples
- Free Spire .NET tools that might make this easier
- One thing I think is key is inline metadata (data attached to a Heading 2, Heading 3, etc.).
- This helps mark up docs TO BE RIPPED APART. Once ripped apart, the pieces are stored so they can later be cobbled back together.
- One way to get these inline metadata keys injected is to use citekeys. Here is a practical how-to in Word.
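
To make the rip-apart, store, and re-glue idea concrete, here is a minimal Python sketch. It is not taken from any of the links above. It assumes the document has already been read into a list of `(style name, text)` pairs (a library such as python-docx exposes `paragraph.style.name` and `paragraph.text` for this), and that each heading carries a Pandoc-style citekey like `[@scope]` at its end. The marker syntax, the `snippets` table, and all function names are made up for illustration, and only plain text is stored, not Word formatting.

```python
import re
import sqlite3

# Hypothetical marker: a citekey such as [@scope] at the end of a heading line.
KEY_PATTERN = re.compile(r"\[@([A-Za-z0-9_:.-]+)\]\s*$")


def split_sections(paragraphs):
    """Rip a document apart at its headings.

    `paragraphs` is a list of (style_name, text) pairs, e.g. ("Heading 2", "Scope [@scope]").
    Text that appears before the first heading is ignored in this sketch.
    """
    sections = []
    current = None
    for style, text in paragraphs:
        if style.startswith("Heading"):
            match = KEY_PATTERN.search(text)
            current = {
                "key": match.group(1) if match else None,
                "level": int(style.split()[-1]),          # "Heading 2" -> 2
                "title": KEY_PATTERN.sub("", text).strip(),
                "body": [],
            }
            sections.append(current)
        elif current is not None:
            current["body"].append(text)
    return sections


def store_sections(conn, sections):
    """Store every keyed section as one row; sections without a key are skipped."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snippets "
        "(key TEXT PRIMARY KEY, level INTEGER, title TEXT, body TEXT)"
    )
    for section in sections:
        if section["key"] is None:
            continue
        conn.execute(
            "INSERT OR REPLACE INTO snippets VALUES (?, ?, ?, ?)",
            (section["key"], section["level"], section["title"], "\n".join(section["body"])),
        )
    conn.commit()


def reassemble(conn, keys):
    """Re-glue stored sections in the order given by `keys`; unknown keys are skipped."""
    parts = []
    for key in keys:
        row = conn.execute(
            "SELECT title, body FROM snippets WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            parts.append(f"{row[0]}\n{row[1]}")
    return "\n\n".join(parts)
```

Using the key as the primary key means re-importing an edited document overwrites the old snippet instead of duplicating it. Keeping the formatting would mean storing something richer than plain text in the `body` column, which the linked Stack Overflow question discusses.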
Some similar systems
- PhraseExpress
- Pandoc – an open source project that converts one document format to another (usually pitched at a fairly geeky level rather than at ordinary users)
- C# code for finding start and end placeholders in Word
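
The start/end placeholder idea in the last link can be sketched in a few lines. This is a Python illustration, not the linked C# code, and it works on plain text already extracted from the document. The `{{START:name}}` and `{{END:name}}` marker syntax and the function name are assumptions invented for this example.

```python
import re

# Hypothetical markers: {{START:name}} ... {{END:name}}, where both names must match.
PLACEHOLDER_BLOCK = re.compile(
    r"\{\{START:(?P<name>\w+)\}\}(?P<body>.*?)\{\{END:(?P=name)\}\}",
    re.DOTALL,  # let a block span several lines
)


def find_placeholder_blocks(text):
    """Return (name, body_start, body_end, body) for each matched start/end pair."""
    return [
        (m.group("name"), m.start("body"), m.end("body"), m.group("body"))
        for m in PLACEHOLDER_BLOCK.finditer(text)
    ]
```

The returned offsets point at the text between the two markers, so that span can be cut out and stored, or replaced later with a stored snippet.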