Version Control for Documentation?
CodeNation asks: "I'm a coder in a smallish (~50 staff) company with a ~20 strong development team. We, the development team, have been using CVS and CVSWeb to manage our source base for a couple of years. In the meantime, our corporate documentation has become a complete mess. By 'corporate documentation' I mean content such as Word documents, PowerPoint presentations, and Excel spreadsheets. Anyway, I was recently asked by one of the bosses to put a document version control system in place for this corporate documentation. All this, and the system has to be usable by the non-technical." Ask Slashdot has touched on a similar topic but it's been about 2 years since that article. Has there been any headway in this area?
"Now, this would be a trivial task if:
- The documents were text-based (i.e. the file formats weren't binary)
- The entire company understood how to use CVS
However, neither of the above is true.
I took a look at CVSWebEdit, but unfortunately it's not quite there yet in terms of stability and usability.
Does anyone have any suggestions for a possible solution? What are you currently using for document control (remember these are Microsoft Office documents)? Also note that although the development team works on Linux boxen, the non-technical staff works in a Windows environment.
Thanks for your help."
Re:"Good Developers" can just slap on a front-end. (Score:1)
From an old beard (Score:1)
Look at "Doors" (Score:1)
DocuShare from Xerox (Score:2)
Well, when it actually becomes usable (Score:1)
Re:As much as I hate to say it... (Score:2)
Microsoft Visual Source Safe.
Unlike everyone else who replied to you, I'm going to say more than "SourceSafe sucks ass!"
I work at a Windows shop, we use SourceSafe to store all our code (VB, C++, HTML & ASP stuff), and also documentation. For code, it's fine. (I mean, it's slow and ugly and shit, but it works fine. People who have worked there for many years say it's never "eaten" anyone's file.)
But we do also store Word & Excel files in there (and Access mdbs), and for that, well, it's pretty pointless. Because I'm assuming that the asker wants to be able to do more than just have the latest version available - if that was all he wanted, the files would be sitting on a fileserver and he wouldn't be asking Slashdot. If you want to be able to do diffs, see WHAT was changed in a checkin, there's no simple way to do it with binary files like Word and Excel produce. Go on, diff them with SourceSafe - "binary files differ" is all it will tell you.
There may be a solution (hell, doesn't Word store changes itself? Haven't people gotten into trouble by publicly releasing documents with old text "hidden" in them?) but SourceSafe isn't it.
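To make the "binary files differ" point concrete, here is a minimal Python sketch (not how SourceSafe or any other tool actually works; the function name is made up for illustration). It assumes a file counts as text if it decodes as UTF-8, which is a simplification: text gets a line-by-line diff, anything else only gets a yes/no answer.

```python
import difflib


def describe_change(old: bytes, new: bytes) -> str:
    """Return a readable diff for text, or a bare verdict for binary data."""
    if old == new:
        return "files are identical"
    try:
        # Simplifying assumption: anything that decodes as UTF-8 is "text".
        old_text = old.decode("utf-8")
        new_text = new.decode("utf-8")
    except UnicodeDecodeError:
        # No line structure to compare, so this is all we can say.
        return "binary files differ"
    diff = difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        "old", "new", lineterm="",
    )
    return "\n".join(diff)
```

Fed two revisions of a plain-text file, this reports the changed lines; fed two revisions of a Word file, it can only report that they differ, which is exactly the complaint above.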
Re:CVSGui (WinCVS) (Score:1)
jCVS [jcvs.org] is also a decent graphical CVS front end and based on a cursory look at CVSGui/WinCVS, it runs on far more platforms and provides a consistent interface across them (it's written in Java, so it should work on pretty much anything with a good VM and set of class libraries). I used to use it all the time on Windows and now I use it all the time on Linux, so I know it works well on at least those two platforms. I think you can also set it up as a servlet so that users can access it from a web browser.
This isn't to say that CVSGui isn't a good program, I'm just pointing out another UI option.
Re:Binary File Version Control - problems with it (Score:2)
My solution: LaTeX for me, PDF's for the clueless. (Score:2)
However, I write all my documentation in LaTeX, using encapsulated PostScript for diagrams. This isn't a problem, because I can generate a PDF which is stored in CVS along with the text source files. I do this for software internal documentation as well as for presentation materials (slides).
Binary files are okay in CVS as long as they are derived objects rather than primary objects. What happens during a merge is that you merge the primary objects (LaTeX source files), regenerate the derived binary object (PDF) from the primary sources, and commit the results. However, when the primary objects are non-mergeable (e.g. Word documents) you are in trouble. If two people modify the same Word document on separate branches, there is no way to merge other than by hand editing.
With LaTeX, the documentation can be broken into many files which are stitched together with \input directives. This was an advantage because at one point I was working in a small team on the same document. Using multiple files minimizes the amount of back and forth merging that takes place with concurrent development.
I have a few LaTeX macros which can be used to embed CVS keywords like $Id$ in a special way into the documents. The macro actually collects these into a file during latex processing. A special appendix is then added to the output which lists all of the source files and their versions. This is called the ``Bill of Materials'' and tells the reader exactly what versions of the files in CVS correspond to the hard copy.
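The macros themselves aren't shown here, so the following is only a rough stand-in written in Python rather than LaTeX: a sketch that scans source files for expanded CVS $Id$ keywords and builds the same kind of bill-of-materials listing. The function names and the "*.tex" pattern are assumptions for illustration.

```python
import re
from pathlib import Path

# An expanded CVS keyword looks like:
#   $Id: intro.tex,v 1.4 2001/03/20 10:15:42 kaz Exp $
ID_PATTERN = re.compile(
    r"\$Id: (?P<file>\S+),v (?P<rev>[\d.]+) (?P<date>\S+) (?P<time>\S+) "
    r"(?P<author>\S+) (?P<state>\S+)[^$]*\$"
)


def parse_id(text):
    """Return (file, revision, date, author) for the first expanded $Id$, or None."""
    match = ID_PATTERN.search(text)
    if match is None:
        return None
    return (match["file"], match["rev"], match["date"], match["author"])


def bill_of_materials(directory, pattern="*.tex"):
    """Collect one entry per source file that carries an expanded $Id$ keyword."""
    rows = []
    for path in sorted(Path(directory).glob(pattern)):
        entry = parse_id(path.read_text(errors="replace"))
        if entry is not None:
            rows.append(entry)
    return rows


def format_bom(rows):
    """Render the entries as a plain-text table suitable for an appendix."""
    return "\n".join(
        f"{name:<24} {rev:<8} {day} {author}" for name, rev, day, author in rows
    )
```

An unexpanded "$Id$" (a file that has never been committed) is simply skipped, so the listing only names revisions that really exist in the repository.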
Here is a good advocacy page you can throw at people who think that word processors are actually suitable tools for intelligent documenting: Word Processors: Inefficient and Stupid [wfu.edu].
Re:CVSGui (WinCVS) (Score:2)
Re:As much as I hate to say it... (Score:2)
Xanadu of course. (Score:3)
Joseph Elwell.
Use CVS and LaTeX - seriously (Score:2)
LaTeX is easy - there are Windows environments for it. Even without it, a trained chimpanzee should be able to write \chapter{Title} to start new chapters. For non-mathematical writing, that's about all there is to LaTeX. Maybe \textbf{bold} too, but then it's about covered. Arguing that it is too hard is going to take better arguments than "remembering a small handful of words (so-called codes, even though they're often real words) is too hard".
And CVS? Well, Cygwin (from sources.redhat.com) has CVS and it works beautifully. Don't tell me that writing "cvs update" is harder than locating the correct share on a server. Just configure CVS for the non-technical people so they don't have to worry about CVSROOT.
Resolving CVS conflicts? Anyone who's had their Word document changes reversed because someone else corrected a spelling error in their document while they were working on it is going to be really encouraged to learn about conflicts. Usually though, conflicts are easy to resolve in non-code (documents).
If you have to argue against this, please make up a sane argument - saying "but people only know Word" is not an argument. With this solution they don't even *have* to know Word.
XML (Score:2)
Re:As much as I hate to say it... (Score:2)
I think you're missing the point (Score:5)
* Current Version
* Related Categories
* Awaiting Approvals
* Approvals Received
* Place in the system
* What time to re-review
* Notes for a document
This way, anyone can easily find a document, and find any past versions. New documents will have to go through a formal approval process, and people will be automatically notified when documents need to be re-reviewed. Notes can be attached for clarification and questions.
CVS can do versioning, but not the rest. And, CVS's versioning is MUCH too complex for what you need. You don't need branches/tags/everything else with corporate documents. You are never going to merge between branches. You just have the document and its version.
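As a rough illustration of the per-document record described above, here is a Python sketch. The class, field and method names are hypothetical, not taken from any particular document management product.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class DocumentRecord:
    """One entry per document: a linear version number plus workflow data."""
    title: str
    location: str = ""                       # place in the system
    current_version: int = 1
    categories: List[str] = field(default_factory=list)
    awaiting_approval: Set[str] = field(default_factory=set)
    approvals_received: Set[str] = field(default_factory=set)
    review_due: Optional[date] = None        # when to re-review
    notes: List[str] = field(default_factory=list)

    def new_version(self, approvers):
        """Bump the version and restart the approval round."""
        self.current_version += 1
        self.awaiting_approval = set(approvers)
        self.approvals_received = set()

    def approve(self, person):
        """Move one approver from 'awaiting' to 'received'."""
        if person not in self.awaiting_approval:
            raise ValueError(f"{person} is not a pending approver")
        self.awaiting_approval.remove(person)
        self.approvals_received.add(person)

    def is_approved(self):
        return not self.awaiting_approval and bool(self.approvals_received)

    def needs_review(self, today):
        return self.review_due is not None and today >= self.review_due
```

Note there are no branches or tags anywhere in it: the version is a single counter, and everything else is approval and review bookkeeping.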
Wrong Person (Score:2)
Documentation folks are used to a different work-flow model. Traditionally there are copy writers, graphic artists, layout folks, and layers of editors. Their workflow is folders, or in the computer-age their representation, directories.
They're analogous models but there are differences. In writing & graphics the latest version is not always the best. Content, as you've learned, can come from many sources and in different formats. Materials are used & reused in bewildering permutations. While automation systems exist (and are at least as sophisticated as CVSs) they require much more customization.
Frankly you're the wrong person for the job.
While you may have been great at organizing materials and workflow in your field you're now outside of it. It's a different culture (albeit possibly dysfunctional in this case) and for you it's terra incognita. The tools are different, the processes are different, the constraints are different, the end products are different.
The best strategy would be to bring in someone knowledgeable in documentation to evaluate the current staff & systems and make some suggestions, possibly implement them. If things have gotten to be "a complete mess" then there's undoubtedly a number of problems and a deus ex machina won't fix them.
The problem could well be poor organization. It could as well be unqualified staff, understaffing, poor management, lack of support from other parts of the company, bad technology choices, etc.
You're not really qualified to judge any of these, good as you may be in your area and in spite of the general problem-solving and management skills you might have. It's as if one of these folks were asked to come in and help shape things up when the R&D folks are late writing a release.
Indeed if it were me (and I've been invited to undertake similar projects in my career) I'd decline this "opportunity". What you're being asked to do has nothing to do with your competencies. Indeed if the only point of congruence is familiarity with CVS then the bosses would be better suited - they do more general writing and live more in the world of memos & PowerPoint slides.
You were hired to code. Presumably you're good at that. You weren't hired to analyze business processes, research & specify documentation automation systems nor to implement them. Finally, if you do this you'll be the guy for this in the future, any issues regarding it will end up on your desk from now on. Is that where you want your job to go?
Beware of the $$$ packages... try BSCW (Score:2)
I would suggest you take a look at BSCW [bscw.gmd.de].
Features: web interface, users/groups, upload, drag and drop, email notifications, revisions, workspaces, delegation, etc....
I put up BSCW at a past job with a team about the same size as yours. It was okay but making sure people could get to it was the main issue. i.e. this was back when getting folks to use a web browser was a chore ;)
You can tweak it a great deal and there are drag and drop "goodies" utilities for the form upload challenged.
-Jay
Visual SourceSafe (Score:2)
Since you've stated that the droids and the technophiles don't use the same OSes, it almost makes sense for the droids to be using an MS-centric VC application. It probably handles Office files natively and has a pretty interface.
On the downside, you'll pay mucho dinero for it and you'll need an NT server for it to run on (speculation on my part).
On the other hand, CVS handles binary items just fine, you just can't include RCS tags in them. If you have a nice web based wrapper, then CVS would work fine, 'cause you can show log items independently through the interface. Browser upload and DL would solve the checkin/out problems.
May the force be with you.
What's your budget? (Score:2)
What you're missing here is the ability to browse versions via cvsweb and the like, and do visual diffs on them, which are beyond the scope of CVS seeing as they'd involve DOC- or XLS- to-HTML conversion, among other things.
This is usually why a company or organization invests in a commercial DMS like Documentum, or in a commercial source-control system, depending on how your needs skew. Doing that stuff and making it work well and quickly while still easy to use is not simple, and doing a good job of all the format conversion and parsing requires the commercial conversion filters that Inso and Adobe have owned over the years, and which are in nearly every product that does good conversions of MS Office documents.
That said, you may be able to assemble something out of the following, for example:
Best to use Adobe Acrobat for handouts (Score:2)
Plus, all the change tracking in Word goes away (I won't call it version control, because it's not).
Actually, there are some nasty bugs associated with change tracking (screws up automated numbering of figures), so it's another feature of Word I never use.
Jon Acheson
You can use a network directory if you do it right (Score:3)
I don't like using CVS with nontechnical people who haven't been trained for it. They are as likely to accidentally overwrite the new version of the doc with the old as they are to save themselves any trouble. So, if you go with CVS, definitely factor in training for everyone.
But, if that won't fly, maybe you should just consider using a plain old network directory, with some careful setup.
Basically, use grouping and network access privileges to give each workgroup their own directory no one else can edit. Each software project has its own subdirectory, and each set of docs for a version of the software a deeper subdirectory. Keep the different versions of the docs totally separate (don't try to be clever and have a shared graphics directory), and archive the old docs in zip files so that nothing gets lost or written over.
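The archiving step could be scripted along these lines. This is a sketch only: it assumes a layout of <project>/<version>/ for the docs, and the function name and archive naming scheme are made up for illustration.

```python
import zipfile
from pathlib import Path


def archive_docs(version_dir, archive_dir):
    """Zip one version's docs into archive_dir without ever overwriting."""
    version_dir = Path(version_dir)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    # Assumed layout: <project>/<version>/..., giving "<project>-<version>.zip".
    target = archive_dir / f"{version_dir.parent.name}-{version_dir.name}.zip"
    if target.exists():
        # Refuse to clobber an existing archive: nothing gets written over.
        raise FileExistsError(f"{target} already exists")

    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(version_dir.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(version_dir).as_posix())
    return target
```

Because each version directory is self-contained (no shared graphics directory), the resulting zip is a complete snapshot of that release's docs.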
That's a start. There is a whole lot more involved in doing this job right: you have to build standardized templates, get the users to enter information for each doc (this can be done easily using a macro that makes them fill out a form when the doc is created), collect that information and set up search tools and indexes.
It's a complicated subject, and that's the short version. If you want to read more, check out my 16-page Guide to Standardizing and Organizing Documentation [fast.net]. It's not quite where I want it to be yet (second draft but not finished), but I think it might be helpful.
Jon Acheson
Re:As much as I hate to say it... (Score:2)
Re:As much as I hate to say it... (Score:2)
The actual MS product is crap. There is an add-on, though, called SourceOffsite, which is much, much better; see http://www.sourcegear.com/
While generally I just ignore such claims on Slashdot, this just takes the cake and pushes me to respond. SourceOffsite (I use it every day so I'm pretty aware) is basically SourceSafe Lite: all the same features used in the same way, just fewer of them. How in the world is that so much better?
The good thing about SourceOffsite is that it does have a server component for low bandwidth connections, but without that requirement it'd be enormously stupid to use it in an office.
some nifty scripts to disguise cvs as file sharing (Score:2)
So when they Mount the networked volume, what really happens is a script does a cvs checkout for them, then they see the mounted drive locally. Then they save, folder action converts
You can do a commit after that, or when they unmount the volume...a script does a checkin/commit. Season to taste.
So that's it....basically, you use scripts to abstract the user from CVS. The happy benefit is by using text formats...you clean all the macro viruses out of your data.
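A bare-bones version of the mount/unmount hooks might look like the Python sketch below. The hook names, the way they get triggered, and the commit message are all assumptions; only the cvs subcommands themselves (checkout, commit -m) are standard.

```python
import subprocess


def checkout_command(cvsroot, module):
    """Command run when the user mounts the share (hypothetical hook)."""
    return ["cvs", "-d", cvsroot, "checkout", module]


def commit_command(user, message="automatic check-in on unmount"):
    """Command run when the user unmounts the share (hypothetical hook)."""
    return ["cvs", "commit", "-m", f"{user}: {message}"]


def on_mount(cvsroot, module, workdir):
    # Populate the directory that is about to be exported to the user.
    subprocess.run(checkout_command(cvsroot, module), cwd=workdir, check=True)


def on_unmount(user, sandbox):
    # Commit whatever the user saved while the volume was mounted.
    subprocess.run(commit_command(user), cwd=sandbox, check=True)
```

One thing a real script would still need: files the user created from scratch have to be registered with "cvs add" before the commit will pick them up.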
teach them cvs or... (Score:2)
The third option is to try PVCS which is another form of CVS / SourceSafe, but made by a different company. It has a GUI just like SourceSafe and handles binaries.
We just use a fileserver and people edit from there. You can set up Word docs to keep track of who did what edits and change colors. There is an option in it as well.
I don't want a lot, I just want it all!
Flame away, I have a hose!
Several Solutions (Score:3)
There are several possible solutions to this problem, as I see it. If you want to go the commercial route, Visual Source Safe is the normal standby, or any other commercial source code repository systems would work. VSS has poor merging capabilities, but when you're dealing with Word files or PowerPoint slides, that's not so much a problem (as Office can "version" the files for you -- though make sure to strip the file before releasing it to customers). Also, the upcoming Office XP has Sharepoint capabilities, and is very easy to use (easy enough for your pointy-haired boss to figure out, even).
On the other hand, you could set up CVS anyway, even though you said you'd rather not. There are a few nice win32 GUIs for CVS, so the end-user experience shouldn't be too bad. On top of that, if you use CVS for your documentation, you can keep that in your source tree. One less tree to manage, and you'll always know where the documentation is. And CVS can handle binary files just fine. Just don't expect to be able to merge changes.
Of course, another (less likely) option would be to move all your documents to HTML, or XML with a set of defined XSLs to transform them. That way, you could do merging just fine. However, that's most likely not a workable solution if you deal with anything but Word documents (since Word can save to HTML instead of .doc).
SilentOne (Score:2)
[Plug]
I briefly worked for a company in New Zealand called SilentOne [silentone.com] on their eponymous KM product. Not only were those kiwis awesome to work with, their product was very cool. While it ran on Windows only, it was a very good solution when compared with some of the others out there (such as FileNet and Lotus/Domino).
Obviously, since you're using Linux, this product isn't useful to you, but I found it very intuitive, and its integration into the Office toolbar very helpful (just like the CVS commands built into XEmacs). It had a web-interface which could embed Office apps into it, as well as the interface within Office.
I'd recommend it to anyone here stuck with a Windows-only shop (and boy, there are way too many of those...)
[end plug]
Binary File Version Control - problems with it (Score:5)
Some issues:
One - Binary file formats - such as MS Word - cannot have diffs run on them that are meaningful - a binary diff can result in a file larger than the two original files. So most version control programs will store binary files as separate versions - but will NOT show the differences between the versions.
Two - Microsoft HAS some built in support for versions within Word - however this will quickly result in VERY large files - which get increasingly less usable. Also note that this will be ONE file containing all versions - if you "version" this file, you will have TWO files with different sets of the underlying versions.. sounds confusing? It would be.
Three - There ARE version control programs that have worked with Microsoft to learn how to understand the underlying MS file formats (I believe Clearcase may, possibly MS's own Sourcesafe (which is otherwise a dangerous version control software to use since it can have data integrity problems), and possibly a few others).
So - what would I recommend or suggest?
First - Look carefully at WHAT you are intending to version. Is it a collection of documents (i.e. a full manual)? Individual documents that change over time? A whole project structure (say a website for online help?) Or something else?
Can you separate out the FORMATTING (which might be in MS Word) from the content? For example, by using a Master document format and importing TEXT documents into MS Word? This would allow great flexibility in versioning the underlying text documents and keep the MS Word file smaller, and that file could be "versioned" by storing copies of each successive version.
Second - For simple document management systems, (which run on Linux but can be accessed by any browser) look at a system like InfoPlace - simple, open source (I think) and easy to use. It is however not a rigorous version control system, but a partial version control system.
Hope this is helpful - I spent 2+ years teaching and managing version control for a very large development operation (1000+ developers worldwide).
Shannon
CVSGui (WinCVS) (Score:2)
There are GUIs for CVS for many operating systems (including Windows and MacOS) at cvsgui.org [cvsgui.org].
Non-technical people can be taught how to use 'add', 'update' and 'commit' in 15 minutes, at least it has worked in my company. (They use WinCVS for exchanging Excel documents with external clients and are happy with it.)
Re:As much as I hate to say it... (Score:2)
Just as another voice (and remember that these stories aren't statistics), I also know of a VB shop (a very talented one that took good jobs within the scope of VB and delivered good apps, so I assume they knew what they were doing) that had their VSS bite the big one. When they restored from backup, it did it again. And again. Something had broken deep inside and caused a logic bomb that always eventually exploded. They eventually checked *everything* out of an old version, killed the system and rebuilt, but it cost the entire company (all sales documents, templates for various company functions, etc. were in there) a couple of weeks, and got them some upset clients.
But, otoh, they are still using it...
--
Evan
Sigh... groupware (Score:2)
Check out Domino.doc [lotus.com], which is a super-enterprise-grade document collaboration platform that does versioning, archiving, searching, approval and workflow, etc. Not sure if it runs on Linux, but the core of Domino certainly does.
And, yes, I know that nobody wants to admit it, but this is really the area where the MS Exchange + Office platform excels (ooh, accidental pun). Where Lotus needs an add-on (.DOC), much of this searching/versioning is built into the core MS software. Of course, then you have to start administering an NT server... --JRZ
Re:PVCS (Score:2)
Chaos to the rescue (Score:5)
-B
Re:Ultra simple CVS client (Score:3)
Greenstone (open source) (Score:2)
"Greenstone is a complete digital library creation, management, and distribution package for Unix or Windows. Users create collections by gathering a set of input documents, specifying a configuration file, and running the build script. It provides full-text and fielded searching, browsable indexes, customised formatting, metadata extraction (acronyms, languages, etc), a Z39.50 client, and many other features. It supports many input formats, the interface is configurable and multi-lingual, and collections can be distributed on the web or on CD-ROM."
Shameless plug (Score:3)
It seems that they're marketing it now as "a highly scalable and comprehensive collaborative environment for the development of Web-based intranets, extranets and e-business applications." Oh dear.
Look into StarTeam (Score:3)
As a developer, I usually prefer CVS, but StarTeam works quite well for a whole office, Word docs and all. For the Windows-based world you mention, it seems quite appropriate. They have many different clients, and I've seen it used in mixed Windows & Solaris & Linux environments.
In general, if a shop can't use CVS, and especially if they're using SourceSafe, I can in good conscience recommend it. And remember, friends don't let friends use SourceSafe :-)
IANAL, YMMV, etc. I'm not sure if it will work for you, but it's definitely worth investigating
Use a Wiki (Score:2)
I searched, but no-one else seems to have suggested this. Use a wiki. I recommend MoinMoin [sourceforge.net].
It is usable by non-technical users (a *lot* simpler than MS Word, certainly), and keeps a complete revision history, and can show diffs. Can be installed on NT with IIS5 with full functionality.
The down side is converting existing documentation to use it, although most solutions to your problem will involve that to some degree.
-Spiv.
Re:Visual SourceSafe (Score:2)
I would recommend you use anything except VSS. Regardless of your OS religion, userland preferences, editor choices, or revision control system, your goal is to be able to track change. The usefulness of your system is null or negative if said system cannot be relied upon when you want to revert to an older version. Relying on VSS is tantamount to "backing up" by dump(8)-ing to
Re:"Good Developers" can just slap on a front-end. (Score:3)
True... and the problem with the files being binary can be solved by UUencoding them, first.
Subversion (Score:3)
Not a big problem (Score:2)
RTF can easily be handled by CVS, so there would be no problem there. If I remember correctly, Word supports embeddable data fields, and this can be used to synchronize document versions with software versions, as well as handling the various scheduling and life-cycle data that the docs folks need. For Excel, this data can be stored in standardized grid locations that are outside the displayed or printed boundaries defined. I don't know about PowerPoint.
If the docs department has not already done so, they would have to make use of "master documents" to control the full publication, so each writer, illustrator, etc., is working with a relatively small file that covers no more than one chapter or section of a document. This is the only way that file locking will work successfully.
One person in the docs department would have to learn WinCVS, or some other GUI front-end, to be able to resolve the occasional problem (or they can just contact your CVS person). There would also have to be a few standards imposed, such as closing a file when leaving for the day. But these are very easy to implement. You could even use VBA to force a save and file close after x hours of inactivity and generate a message or email to the docs writer to explain what happened.
I strongly advise against using a separate version control system for software devel and tech docs, since this would only complicate coordinating the right software versions with the corresponding tech docs versions. If you make it transparent, then each tech docs professional is doing things the way that they know how, but the underlying mechanism is doing what needs to be done to implement proper version control, and synchronization with software versions.
If your company is on a tight budget, this is the way to go. Not only can you resolve your problems with a minimum of expenditure (a day or two of programming time), but you can also sell your VBA code to other companies with the same problem (or to one of the companies that specializes in such things), or donate it to become part of the CVS repertoire.
Web+Samba+Scripting language+DB (Score:2)
Among the features: search criteria for many attributes, including indexed text-lookups (by using the Unix "strings" tool). There were two modes. One was "browser-mode", which could only search for "released" versions of documentation and click on a link to download the document. The other was "editor-mode", which required a secured login. From here you could browse any version: draft, approved, or released. At that point, you could "check-out" a document, which would lock it under your name and save a copy in your work-space, which is on a samba/nfs-exported directory (your X drive in Windows). It was su'd to your name so that only you could edit it. Documents were categorized into classes, subclasses, and document-types (product/project/quality, then product-name or project-number, and finally doc-type, which was technical spec (TS), functional spec (FS), requirements (RQ), etc.). Each document was given an auto-incrementing document number for unique identification, so a document name was "DCMS_FS_00079.doc". Managers controlled the doc-categories and creation of new documents. It became a full time position, but FDA requirements are pretty strict.
When you were all done, you used the web to check it back in, which created a new draft version and allowed others to check it out.
Authors would submit requests for review of documentation to manager-users, who would ultimately "approve" and later "release" a given version (making it available for public viewing).
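The numbering and check-out locking can be sketched in a few lines of Python. This is not the DCMS code (which was Perl on top of ClearCase); the class and method names are invented, and the in-memory dictionary stands in for whatever the real system kept in its metadata store.

```python
class DocumentRegistry:
    """Hands out numbered document names and tracks check-out locks."""

    def __init__(self, next_number=1):
        self._next_number = next_number
        self._locks = {}   # document name -> user holding the lock

    def new_document(self, product, doc_type, extension="doc"):
        # Auto-incrementing, zero-padded number, e.g. DCMS_FS_00079.doc
        name = f"{product}_{doc_type}_{self._next_number:05d}.{extension}"
        self._next_number += 1
        return name

    def check_out(self, name, user):
        holder = self._locks.get(name)
        if holder is not None:
            raise PermissionError(f"{name} is already checked out by {holder}")
        self._locks[name] = user

    def check_in(self, name, user):
        if self._locks.get(name) != user:
            raise PermissionError(f"{name} is not checked out by {user}")
        # Releasing the lock is what lets others check the new draft out.
        del self._locks[name]
```

A real deployment would persist both the counter and the locks (a database table would do), since losing either on a restart defeats the purpose.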
Our first version used "clear-case" as the version control (which allowed direct URL-links to files, instead of explicit extraction from RCS), and as the meta-data storage repository (since it had its own searching engines). We made heavy use of Perl and at the time Netscape-server.
I was in the process of rev-2.0 which would use Apache, Perl, RCS and MySQL and ideally be open-sourced, but our company took a major turn which halted all projects. However, DCMS is still fully running today (under different management), and aside from the performance issues of the clear-case dependencies, works great.
If I had the time, and if I were to do it today, I'd use postgres (for roll-back support), flat files (no real point to RCS if you let the DB do everything), Apache and maybe something like zope with python (though I'm still partial to perl).
-Michael
Re:All you problems can be fixed with a CVS system (Score:2)
I thought I had made peace with the FreeBSD users' group after all these years. Now you tell me that I have to start the flame war anew. You must be one of those twisted vi and tcsh users!
Commercial DM Products (Score:5)
Check out the products available from Hummingbird [hummingbird.com], Documentum [documentum.com], and Eastman [eastmansoftware.com]. A long list of document management vendors lives here [google.com].
---
Re:Chaos to the rescue (Score:2)
Re:As much as I hate to say it... (Score:2)
Since all of your stuff is M$ office stuff, use Exchange server. It will keep track of who made edits to files, when they made the edits, and will only allow one user to make edits at a time. Combine that with or have that independent of SourceSafe and as long as you backup your environment with regularity, you're golden.
We use both within our company (yea, small dev firm that uses microsoft products), but they work well and we have yet to have any problems.
I see that people on this thread have claimed that SourceSafe has "eaten" some of their documents... can you honestly believe the word of someone that uses the technical term of a file being "eaten"?
uphill battle (Score:5)
Maybe I'm cynical, but with your stated goals of
- implementing version control and
- making it usable by nontechnical people
you face one major uphill battle.
Many nontechnical people have a hard time understanding a hierarchy, or file types; this is expressly why Windows 95 defaulted to hiding file extensions and the subdirectory trees.
Add to that the complexity of "where in the hierarchy does this file permanently belong," and the question "at what point in time was the file in a condition you liked?," you get into a major learning curve. Describing a sandbox is a task unto itself. Undisciplined developers often grok CVS but still don't use the delta comments in any meaningful way.
That said, VMS is probably your ideal here for simplifying version management. Too bad it was an integration into the filesystem itself, and didn't expressly deal with multiple writers or delta comments.
For those who haven't used VMS, the filename included a version number: name.extension;version . If you neglected to mention the version number in a system call, it assumed the newest. Every file opened for writing got the next version number and left the old versions untouched; every file opened for read-write cloned the newest old version first and bumped its version number. This builds into a large list of ;1 ;2 ;3 ;4 ;5 ... ;632 versions for each file. You could easily back them all up, or prune to the newest version.
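The same idea can be imitated in user space on an ordinary filesystem. The Python sketch below is an approximation under stated assumptions, not how VMS implements it: the function names are invented, and the ";N" suffix is just part of an ordinary filename here.

```python
from pathlib import Path


def versions(directory, name):
    """Return the sorted version numbers that exist for 'name'."""
    found = []
    for path in Path(directory).iterdir():
        base, sep, number = path.name.rpartition(";")
        if sep and base == name and number.isdigit():
            found.append(int(number))
    return sorted(found)


def save_versioned(directory, name, data):
    """Write data as the next version; older versions stay untouched."""
    existing = versions(directory, name)
    number = existing[-1] + 1 if existing else 1
    path = Path(directory) / f"{name};{number}"
    path.write_bytes(data)
    return path


def read_newest(directory, name):
    """Omitting the version number means 'the newest one'."""
    existing = versions(directory, name)
    if not existing:
        raise FileNotFoundError(name)
    return (Path(directory) / f"{name};{existing[-1]}").read_bytes()


def purge(directory, name, keep=1):
    """Prune old versions, keeping only the newest 'keep' of them."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    for number in versions(directory, name)[:-keep]:
        (Path(directory) / f"{name};{number}").unlink()
```

As noted above, this gives you history for free but says nothing about two people writing at once or about why a version changed.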
Re:HTML is your friend. (Score:2)
SharePoint (Score:2)
Basic features are:
Search Engine
Subscriptions
Categories
Document Versioning
Collaboration
IFilter support (to support more than Office docs)
There's a trial version available to test it out.
Re:As much as I hate to say it... (Score:3)
Doesn't say a lot for the product.
The other thing is SourceSafe has problems with more than 200 projects in its repository. It starts corrupting the data. Not what you want to see from a source control system.
----
Just remove the spaces and do the intelligent thing to email me.
There are... (Score:2)
Version control without education = wasted $$ (Score:2)
1) An idiot won't be able to use it properly
2) Having the technology available will only make them seem (and probably feel) stupider
First of all Word and Excel aren't that easy to learn. You can type up and save a document no problem, but learning how to do precise formatting tasks is something that takes good doc people a long time to learn how to do properly. Furthermore, doc people have to understand the thing they're documenting, which is not that much easier than writing the code in the first place, for lots of projects. Thus it's a fallacy that doc people aren't technical enough to learn good version control.
Secondly, CVS is pretty easy to learn. 99% of working with CVS is two commands: cvs update and cvs commit. Even assuming you don't go with one of the many adequate-to-kickass GUI clients for CVS (my fave: TkCVS, maybe even in Windows), your learning curve for what most of the doc people have to know is pretty low. You can send 5% of the team to a class or buy them a book; there are ALWAYS people who want to know just a little bit more about the system they work with, and those people can be the gurus for the rest, who really just want to write doc.
Finally, if you try to implement version control without SOME education about what's going on behind the scenes, you will have lots of minor tragedies. Version control isn't perfect, and you do not want people to think that it is, or they will lose even more of their work when it's not. You want them to know things like: If it's not checked in, it doesn't really exist. Always do diffs so you can see what changes are actually going in when you merge a new version. Let your teammate know when you have to work on a document they already have checked out, or you'll clobber each other's changes.
These kinds of things are obviously harder to learn than the mechanics of actually using the version control system. They have to be, or the coders I work with wouldn't make these mistakes so frequently. Since you have to spend some resources giving your doc department this kind of education anyway, you might as well spend some time teaching them how to use a good doc system as well. CVS is truly one of the best.
--
If they're so dumb (Score:2)
If you can come up with a simple start-up situation for them (this is how you save, this is how you restore, and this is how to get a not-so-recent version), then they'll be happy. When they get used to it, and want the more complicated stuff, then you can teach them that part of what the system is capable of doing for them.
Just don't expect them to learn the whole thing from the get-go, or they'll run screaming. Teach them what they ask for, and then shut up.
For the curious few, you can teach them more (as much as they're willing to learn at the time). They'll be happy to teach that to their colleagues when the time is right. Just be careful with the dangerous stuff ("this is how to delete an entire branch ---- oops.").
--
Document management is sooo much more.... (Score:2)
I have designed and built several of these types of systems. You generally don't hack together a document management system in a weekend. If you think so, then research your users' requirements a bit further.
Faced similar problem - chose Xerox's Docushare (Score:2)
Engineering and Prof Svcs continue to use cvs for source code control and Docushare for all documentation that isn't part of the product itself.
http://docushare.xerox.com [xerox.com]
-Hal Incandenza
Re:"Good Developers" can just slap on a front-end. (Score:3)
first ask yourself... (Score:2)
I work in documentation. I've used a bunch of different version control systems and performed needs analyses a few times when companies I worked for were switching over to "integrated" systems for development and version control, and someone decided that docs should go along for the ride.
Before you decide what system to recommend, you *have* to know what problem you're trying to solve. If the documentation is a mess, is it because you have a million old versions sitting on your intranet, or even in binders? Is it because nothing is indexed so you cannot find the information you need, even though you suspect it exists? Is it written for a mess of different audience types? (PHBs, engineers, end-users, marketing?)
If any of these are the root of the problem, any type of version control is not the answer. What you need is a document repository. This can take the form of an intranet (MS's server happily displays Office docs, though it will eat tons of drive space), a shared directory, or even a designated bookshelf where all the docs go.
Training whoever produces these docs on whatever system you use probably won't help, because it sounds like the real issue is that they haven't grasped the whole process of information management, notwithstanding whatever tools you provide to them to make that easier.
So, send these people to an information architecture course, or if that doesn't help (since there are a lot of not-super competent people working in docs, I hate to admit) get them an admin or a student co-op whose sole purpose in life is to keep track of these things. It may be a longer-term cost than telling them to use CVS, but it will actually work. Trust me, the problem isn't that they lack the tools. Good documentation can be produced by quilting, if your product cycle is long enough. The problem is that your docs manager (if you have one) doesn't understand how to manage information and keep it current. You should get a new one, or send the one you have to training, if you think they're redeemable.
If you are required to come up with some kind of software solution, here are some pros and cons of systems I have used:
CVS: My personal favorite, I've used it on both UNIX and through a Web front-end. Pros are that it's incredibly flexible, free, and easy to migrate from if you switch to another system later. Cons: You can't diff unless it's a plain text file, and if you throw a lot of Office or Frame documents in there, you better have a *lot* of disk space. Plus, we had a lot of resistance from people who had never used versioning. It is surprisingly hard to see the benefits, for a lot of people. We constantly had people writing over files, checking in old versions, and causing all kinds of havoc. Plan on having someone administer this kind of thing if your user base is as you described it.
Visual Source Safe: Stay Away! I've used this in conjunction with an MS intranet server, which carried a large mixed bag of HTML and Office documents, mostly posted by folks with very little technical knowledge. It was a mess. We ended up devoting a lot of our time just administering the thing, which is not as easy as Microsoft likes to claim. And slow Slow SLOW. Sometimes I could hardly believe it. On the plus side, you can visually diff Office documents, if you have the patience.
StarTeam: This system is weird. Docs went along with the rest of the engineering group on this one, but it will seem awkward for people unfamiliar with versioning, and it's very quirky. Its collaborative features, like file locking and bug tracking, are really awful.
And remember, version control makes a lot of sense for code development, but not usually as much for docs, because:
There are a lot of documentation consultant companies out there, some of which are very good, and most of which will happily outsource your whole doc department. Consider that option too.
Good luck.
OpenOffice 6.0 (Score:2)
Perforce (Score:2)
Re:I think you're missing the point (Score:2)
Well put. MS Word has facilities for versioning, specifically the version number, approvals/proposed revisions, and other stuff. I haven't used it extensively, but a Google search will probably turn up a tutorial or ten. You could also invest in one of the multitude of books about MS Word on the shelf at your local bookstore; one of them must discuss versioning. Lastly, check out the commercial offerings posted elsewhere here.
-bluebomber
PVCS (Score:3)
Re:Shameless plug (Score:2)
It works through a web interface, although one that's rather prone to working well only with IE, ugh. I'm not sure how well it would work with Linux boxen -- I'd ask, but then do a large amount of testing to be sure it worked properly.
I have the same problem involving CAD documents (Score:2)
If anyone has any advice relating to this I'd greatly appreciate it. Currently we have an old proprietary system that we seem to not have the source to - even though it was made in house.
Have you checked out WinCVS? (Score:2)
Re:Commercial DM Products (Score:2)
Version control?! (Score:2)
Isn't it okay for people to randomly change business specs on the fly during a project so that there is a near zero chance of success? =P
Plus I like to dig through 30 different docs with different names modified by different people with insanely long filenames.
Gotta go someone just changed the spec without telling anyone again...
Long live document anarchy!
=)
E.
Roll your own? (Score:2)
In the original, the file was a medical record sort of thing, so it was a doc with a name made up of patient SSN, doc ID, and a time/date string. The file names got stored in the db, and the db could launch w to open them via a command.
The only thing I could think of is maybe a widget or macro in Word to save the current date/time as part of the file name, along with the user ID, so that as soon as Word opened the file, it would be saved with the new file name with the updated data in the name.
Wait, as part of the command line, you can set it up to copy the file to a new name with the proper date and time info in the name, then feed the string so you have a proper record of it in the database, and then open it in your word processor. This maintains an audit trail, since every time a doc is opened, a copy is made and saved. You could also include a delete option for folks just looking around.
So this is not too hard to do via the database of your choice. Although you better have a lot of drive space for the clueless. A really messy solution, but if you have space to waste, semi-workable. This would work for in house, not webifried.
You would have to add a layer in there some place for converting to web format, depending on the word processor you use.
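For what it's worth, the copy-on-open idea above could look roughly like this in Python. This is only a sketch under assumptions: the audit table, its column names, the file naming pattern and the open_copy function are all invented for illustration, and SQLite stands in for "the database of your choice".

```python
import shutil
import sqlite3
from datetime import datetime
from pathlib import Path


def open_copy(db, source, user, now=None):
    """Copy source to a user- and time-stamped name, log it, return the copy."""
    source = Path(source)
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    # Hypothetical naming pattern: <name>_<user>_<timestamp><extension>
    copy = source.with_name(f"{source.stem}_{user}_{stamp}{source.suffix}")
    shutil.copy2(source, copy)
    # Hypothetical audit table; one row per "open".
    db.execute(
        "CREATE TABLE IF NOT EXISTS audit "
        "(original_path TEXT, copy_path TEXT, user_id TEXT, opened_at TEXT)"
    )
    db.execute(
        "INSERT INTO audit VALUES (?, ?, ?, ?)",
        (str(source), str(copy), user, now.isoformat()),
    )
    db.commit()
    return copy  # hand this path to the word processor
```

Every open leaves a full copy behind, which is exactly where the drive space goes; a cleanup job for copies that were never modified would be a separate piece.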
Check out the Vinny the Vampire [eplugz.com] comic strip
Re:Roll your own? (Score:2)
That the version tracking can take place inside the document, or outside the document. Version tracking can be engineered outside the document if you have many types of data and documents. Then you will need a database to control the various parameters of the documents such as file names, number of copies, who has access, etc. This can be a problem, but it is not outrageous, depending on the capabilities of the technology at hand. I would think it is certainly within the capabilities of Perl and the table technology of your choice.
Check out the Vinny the Vampire [eplugz.com] comic strip
Re:I think you're missing the point (Score:2)
Doesn't software still require all of those things? I think we call it a System Change Request (SCR) or just Change Request (CR) process.
We have implemented it so that if our customer requests a change (software, user manual...) then it goes thru the same process. Docs also get stored in the same tool as source code. Goes back to the theory of "one process, one tool, many products".
As far as making it simple so that non techies can use it...ummm...I have two recommendations (facts really). 1) non techies are going to have to step up to the plate and use the tool. 2) make the process such that it can be adaptable and easy to use (hold their hand step by step).
Pretty soon you will be at CMM 3.
dman7
My company uses Livelink (Score:2)
Documentation system... (Score:3)
As a systems engineer for a gov contractor, I needed to make a survey for documentation revision control software. Due to my CS background, I immediately thought about CVS, SourceSafe, and other source code revision control packages. The source code revision control packages were not up to the task of documentation revision control. Since systems engineers deal with requirements, and requirements are stored via documentation, every requirements analysis package comes with documentation revision control software. These packages also help to create documents from templates and databases. Depending upon your needs, these documents keep track of customer requirements from concept through delivery/installation, into maintenance.
There are many systems engineering tools that handle requirements analysis in various ways. Many of them work with MS Office (prolly what your writers currently use), and have built-in versioning control. My best source for information on these tools was at the INCOSE tools website [incose.org]. This site lists tools and checks them for the following features...
Store standard document outlines - used as starting points? User definable templates or modifiable?
Produce architecture views from functional and object oriented (OO) perspectives? Examples: WBS, functional, physical, data flow, state diagrams
Support various physical architectures? (View from a number of levels, Black box, Rack, circuit board, chip)
Enable tailoring to specific standards and requirements, IEEE, ISO, MIL-STD?
User friendly & menu driven (drag and drop capabilities)?
Support a single user or multiple concurrent users?
Input document change / comparison analysis
Visibility into existing links from source to implementation--i.e. follow the links.
History of requirement changes, who, what, when, where, why, how.
Baseline/Version control
Access control (modification, viewing, etc.)
Support of concurrent review, markup, and comment
Multi-level assignment/access control
Plus many more features.
It is my opinion, that the following packages are up to the task:
Cradle
Doors
rtm
rdt
Caliber rm
If you go to the page that I linked, it provides an in depth review of each of these tools plus many more.
LyX and CVS (Score:2)
Believe it or not, that's the combo we use. Works well, and LyX even shows the expanded $Id$ on the front page...
As much as I hate to say it... (Score:4)
Microsoft Visual Source Safe.
It stores versions, has a nice, friendly, Explorer-like interface, and runs on windows. Sounds like that's all the management wants. As long as they don't want to branch documents (which I recall being a bit of a bitch), they should be fine.
(All of this with the note that I'm *pretty* sure that VSS handles binaries alright, even though it may not be able to do such things as diffs, even on files in a proprietary format from its own company.)
Re:"Good Developers" can just slap on a front-end. (Score:2)
Best Practices Available (Score:2)
--
DocsOPEN (Score:3)
Basically, instead of saving your files to a drive, Docs comes up and asks you to fill out a form. The file gets saved somewhere mysterious, the data goes into a searchable database, and you get on with your life. Later you call it up by searching for it, and you can add a new version when you save it. It's pretty neat. However, it is hard to get people to use unless you beat them over the head.
MS Office document version control (Score:2)
First, if all you're after is the ability to accept/reject changes to documents, then the simplest solution you want is to turn on the "Track Changes" option. You can leave the master "track" on, then either display or not display changes in either printed or written form. Also, you can selectively accept those changes to the base master document. All changes are also tracked by author, time/date, etc.
Second, if what you're after is protection of source files, then you really are going to have to implement a VCS for your writers. I've used a number of them, and fundamentally, none of them are easy for non-technical people to use without training/etc. Perforce and Clearcase both work reasonably well for cross-platform doc checkin, but, well, they're not exactly the easiest thing in the world to use. A simpler solution might be to have a decent server, and have it incrementally back up _every_ change to the doc section. You'd be surprised how easy that makes it to roll back changes. If you're also willing to be a little anal in the server setup about who owns files, it should become fairly easy, fairly quickly to resolve ownership/etc. issues.
Third, I noticed you were also talking about PDFs, and how the documentation was difficult to access. Two points here. The first is that you probably shouldn't stress too much about storing output (PDFs, generated HTML) in VCS, rather than storing source files. If you can store source properly, storing generated doc (PDF/HTML) becomes _very_ easy. The second is that making your documentation easy to access is an entirely different issue from putting it in a VCS. That's a matter of putting together a clear, logical structure that makes sense.
As a case in point, one of my clients about a year ago had _all_ of their documentation stored in VCSs. Their problem was not that it wasn't protected - it was that, in the course of 2 mergers, and going through 4 different writing teams, they'd _lost_ their documentation source. So before I could write word one, I had to go find the docs, sort out the doc structure, and get everything so it could once again be found and accessed.
One relatively easy-to-use structure that I've seen involved differentiating the source vs. output, and making the output available on an internal webserver. Output was automatically regenerated on a regular basis (simple scripts took care of most of it, or you can use the monkey model, and do it manually), so it was available to the entire company. Those people who needed to mess with the source files were taught how to use the VCS.
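The "simple scripts" part of that setup might be as small as the following Python sketch, which rebuilds only the output that is missing or older than its source. It assumes the source tree and the published tree are separate directories, and the convert callable is a placeholder you would supply yourself; nothing here is tied to a particular converter.

```python
from pathlib import Path


def stale_sources(source_dir, output_dir, output_suffix=".html"):
    """Return source files whose generated output is missing or older."""
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    stale = []
    for source in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        output = (output_dir / source.relative_to(source_dir)).with_suffix(output_suffix)
        if not output.exists() or output.stat().st_mtime < source.stat().st_mtime:
            stale.append(source)
    return stale


def regenerate(source_dir, output_dir, convert, output_suffix=".html"):
    """Run the caller-supplied convert(source, output) on every stale source."""
    source_dir, output_dir = Path(source_dir), Path(output_dir)
    rebuilt = []
    for source in stale_sources(source_dir, output_dir, output_suffix):
        output = (output_dir / source.relative_to(source_dir)).with_suffix(output_suffix)
        output.parent.mkdir(parents=True, exist_ok=True)
        convert(source, output)  # hypothetical converter, e.g. a doc-to-HTML tool
        rebuilt.append(output)
    return rebuilt
```

Run from a scheduled job against a checked-out copy of the source, this keeps the internal webserver current without anyone outside the writing team touching the VCS.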
Re:PVCS (Score:3)
I used PVCS at my last job. My group developed and deployed on Solaris 2.6, even though almost everybody had NT boxes on their desks. We chose to use the Unix version of PVCS, which nominally ran on Solaris, but actually didn't. It actually only ran on HP-UX. We ended up with a really half-baked system where HP-UX machines running PVCS and our development machines NFS mounted a partition served by an Auspex NFS server. We'd log in to the HP-UX machine, set DISPLAY to the desktop machine, and check-out into the NFS-mounted partition. Then, we'd log in to the Solaris development machines to develop, compile and unit test.
PVCS for Unix is an absolute and utter mess. DO NOT let yourself be contaminated by the bug-ridden and awful terror that is PVCS. The GUI (Java app) is a heinous, bug-ridden piece of excrement with incredibly poor user interface. The GUI is slow, and has so many human factors problems that developers occasionally checked-in files in the wrong directory. The GUI doesn't do a good job of mapping "archives" to directories, and it doesn't map from directory to "archive" at all. This leads it to offer you a selection of "Makefile" archives, and you get to choose which one. You must understand the internal structure of the PVCS archive directory tree to choose the right archive. PVCS seems to think that all files have an "extension", whatever the hell that is, and if you have files (like "Makefile") that don't have an extension, it won't expand RCS keywords ($Id$, etc). The back-end is equally buggy. We had a more-or-less half time PVCS administrator, whose job included deleting lockfiles that PVCS would occasionally leave around.
The company I used to work for chose PVCS for two public reasons: (1) it's nominally multi-platform (we found out that it isn't, Merant only claims it is) and (2) it had really, really elaborate management interfaces into the bug lifecycle. This makes managers happy, but leads to lots of ugly "WTF do we do with this bug report?" meetings at the end of a development cycle, the day before a release.
I strongly recommend you avoid PVCS.
MS SourceSafe vs. ClearCase by Rational (Score:2)
For ease of use and based on cost I'd have to say, for your application (as much as it pains me) Microsoft SourceSafe would be a good choice.
Just my 2 cents
--CTH
--
HTML is your friend. (Score:3)
If you need powerpoint-type presentations, Flash [macromedia.com] is easy to use, fast, and readable on nearly all modern browsers. You can even generate it with PHP [php.net] or PERL [twoshortplanks.com].
Think Weblication.... (Score:3)
If you need an easy to use UI, take a look at Xerox Docushare [xerox.com] or perhaps if you want to lean toward groupware look at Amphora [amphora.ee].
"A microprocessor... is a terrible thing to waste." --
Ultra simple CVS client (Score:3)
Check out TortoiseCVS from the CVSHome [wincvs.org] website. It's an add-on to Windows Explorer that adds status dependent color shading to CVS controlled directories and context sensitive commands to the Explorer file menu. Comes with a bundled SSH client for secure tunneling.
Easy to install and VERY easy to use, and no, I don't have anything to do with the project. I just use it.
Re:Shameless plug (Score:2)
Re:Commercial DM Products (Score:2)
SharePoint [microsoft.com]
Art At Home [artathome.org]
A thought... (Score:2)
They refer to it as "document management." While a big part of that is also a matter of FINDING the documents (think of alllll that paper), the most challenging part remains tracking changes. You might want to contact a company that I've dealt with in the past who makes a fantastic system (sorry, it's not open source) called iManage. It's overkill for what you need, but you may be able to show them another market they haven't thought of, and develop something with them to suit your purposes.
And no, I don't work for iManage, have any stock (if they are traded, even), or anything of the sort. I just really liked the product when I helped implement it a few years ago, and know it does a good job.
WinCVS and Chora (Score:2)
I have to admit, though, that while the non-technical people at our company have managed to learn how to use CVS, it was not without a lot of struggle on their part. Maybe SourceSafe or Perforce would have been a better choice. Even so, what we have now works and people have become familiar with it, so I don't think we'll change.
personal experience w/ StarTeam (Score:2)
Sometimes it really sucks ass and eats a file or shows you the wrong status, but for the most part, it's great, and it's much more prone to error on the user side than it is with the client or server, and in most cases we have artists, not writers or programmers to blame for that.
--
Re:PVCS (Score:2)
Everything he says is true, except we actually did get it to work under Solaris (if you can call what it does "working".) It comes with a set of command line tools that were apparently first written when DOS was new and a GUI that is fitted to the system badly on top of them. You'll find that if you use the command line tools exclusively, you'll confuse the GUI, as the command line tools don't have any concept of the GUI's "folders." So those you'll have to edit by hand. However, you can't limit yourself exclusively to using the GUI, as it's not functional enough and it's intolerably slow. Hit ctl-C at the wrong time when using any given command, and you've got to call an admin to cleanup the lock files it will leave behind.
My favorite indication that the Intersolv developers have absolutely no Unix clue: The GUI is very windows-flavored. Run the GUI on a serious unix system, and it will stat each and every filesystem to build a filesystem display. Whee! Watch all of those automounts kick off! See the NFS traffic fly! Or simply go to lunch and hope it's done when you return.
Re:Binary File Version Control - problems with it (Score:2)
That's not true. I used xdelta [berkeley.edu] on two ~180 MB binary files that were quite different, and it made a ~500K patch.
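To show that a compact binary patch is plausible in principle, here is a toy delta in Python built on the standard difflib module. To be clear, this is not xdelta's algorithm or file format, just a copy/insert encoding made up for illustration, and it is far too slow for files of that size.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Describe new as copies of ranges from old plus literal byte runs."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))      # reuse old[i1:i2]
        elif tag in ("replace", "insert"):
            delta.append(("data", new[j1:j2]))  # bytes that only exist in new
        # "delete" needs no entry: those old bytes are simply never copied
    return delta


def apply_delta(old, delta):
    """Rebuild the new file from the old file and the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)
```

Only the "data" entries carry file content, so when two versions share most of their bytes the patch stays small no matter whether the format is text or binary.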
Ryan T. Sammartino
Re:some nifty scripts to disguise cvs as file shar (Score:2)
-----------------
Re:Perforce (Score:2)
The GUI interface is almost impenetrable (based on previous experience with SourceSafe and PVCS), so I wouldn't recommend it for non-techies.
-----------------
Easy! (Score:2)
2) Teach them CVS
haha!
I'm only partially joking however. That would be SO NICE.
Regardless, CVS handles binaries (albeit inefficiently) and isn't that hard to use with wincvs.
--jeff
Teach them CVS, then. (Score:2)
Don't bother trying to get an "easy to use" solution, which will take up valuable system resources just to run. Anything worth doing can be done from something simple like CVS.
If they are really so dumb that they CAN'T learn CVS, well, it's time to tell them that they are in the wrong field, and that they should consider a career in garbage collection or burger flipping. Tough love oughta get them educated real fast.
Do you need versioning? (Score:2)
You probably need to have someone in control over the mess, and if that person can manage a directory tree which is read-only to everyone but the archivist you could easily keep a "history" directory beneath each leaf and stick old versions down there (with filenames containing the date of the revision). That would give you lots of history and a ready means of re-organizing if you find a tool that lets you manage things more easily in the future.
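The archivist's routine could be scripted along these lines. This is a Python sketch with assumed names: publish, the "history" folder name and the date format are my own choices, and file permissions (read-only for everyone else) are left to the file server.

```python
import shutil
from datetime import date
from pathlib import Path


def publish(new_file, current_dir, revision_date=None):
    """Install new_file in current_dir, moving any existing copy into history/.

    Returns the path of the archived old version, or None if there was none.
    """
    new_file, current_dir = Path(new_file), Path(current_dir)
    target = current_dir / new_file.name
    archived = None
    if target.exists():
        if revision_date is None:
            # Default to the date the old version was last modified.
            revision_date = date.fromtimestamp(target.stat().st_mtime)
        history = current_dir / "history"
        history.mkdir(exist_ok=True)
        # Note: two revisions archived under the same date would collide.
        archived = history / f"{target.stem}-{revision_date.isoformat()}{target.suffix}"
        shutil.move(str(target), str(archived))
    shutil.copy2(new_file, target)
    return archived
```

Because the history is just dated files in a folder, nothing stops you from importing it into a real versioning tool later.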
--
Having 50 karma is an itchy feeling; I know I'll get
An increasing problem... (Score:3)
It can be very embarrassing to have all of the private comments revealed to the other party when you didn't realize they were there. Increasingly firms are checking for these things as well.
Word 2002, from the Office XP suite, includes a Security Tab on the Options settings. In there you'll find a Privacy section which gives you checkboxes for things like deleting personal information on save and "Warn before sending, saving or printing" a document that has file revision tracking turned on.
-Coach-
Cobalt (Score:2)
Re:"Good Developers" can just slap on a front-end. (Score:3)
Two weeks, one office employee: $600.
Five weeks, 3 part-time developers: $59,000.
Q4 earnings time, the look on the boss's face: priceless.
There are some things money can't buy. For everything else there's Corporate Mastercard.
--
"Fuck your mama."
CVS Will work but... (Score:2)
The source code at that point was revision controlled in SCCS (yuck) but the documentation was not. Seems very much like your situation I guess.
Putting the documentation in a CVS project along with the software itself would work pretty well, except that CVS will not do deltas on the binary files but will store each revision pretty much in its binary format in the repository. This is not a big problem as disk space is cheap.
Also, teaching people how to use some windows CVS client shouldn't be a big issue I believe. However, I've found in the past that people often *want* the GUI version first and within a few weeks are asking me 'so how can I use that command line version you're using'.
Personally I like to teach people a few very simple things: