Subject dBASE saved a WordPress project
From Alex Safian <noway@noway.com>
Date Sun, 16 Apr 2023 21:37:23 -0400
Newsgroups dbase.getting-started

My company was transitioning from a custom ASP database-driven site to a WordPress site, but the web developers missed serious conversion issues affecting the site's many articles. One that I recall involved Spanish words: the ñ (n with a tilde above it). It looks like the old editor used an extended character set that WordPress did not like, so that Spanish letter got mangled. Lots of other characters got mangled too.
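To illustrate how a letter like ñ can get mangled, here is a minimal Python sketch. It assumes the old site stored its text in a single-byte encoding such as Windows-1252 and that the import read it as UTF-8; I never confirmed which encodings were actually involved, so treat both as guesses.

```python
# Assumption: the legacy site used Windows-1252 (cp1252). The actual
# encoding of the old ASP site is not known.
original = "España"
legacy_bytes = original.encode("cp1252")   # ñ is stored as the single byte 0xF1

# Reading those bytes as UTF-8 fails on 0xF1, because it is not a valid
# UTF-8 sequence here. With errors="replace" it becomes U+FFFD.
mangled = legacy_bytes.decode("utf-8", errors="replace")

# Decoding with the codec the bytes were written in recovers the text.
recovered = legacy_bytes.decode("cp1252")

print(repr(mangled))
print(recovered)
```

The point is only that the same bytes read under the wrong encoding produce a different character, which is the kind of damage that showed up in the articles.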

There was also a strange WordPress "feature": if it thought it was dealing with a text file, it tried to mimic the file's formatting in HTML, meaning an EOL in the text got turned into a <br /> tag. But the article in question was HTML, not text. I never did figure out why this sometimes happened.

We should have checked the conversion results, noticed the problems, gotten the devs to fix their script and rerun the conversion, and repeated that loop until all the problems were fixed.

But before there were proper checks on the conversion, some of our people started modifying the organization of the site, adding to it, etc., and the problems weren't noticed for a number of months. Starting over with a better conversion would have meant that all that work would have been lost.

Looking for a solution, I exported the WordPress site to a giant WXR file (WordPress eXtended RSS, an XML export format that is essentially a backup) and could see the problems in the article sections of that file. The developers at the web company said it was a "fool's errand" to try to fix those issues while keeping the modifications our people had made to the site.

They were probably thinking about using Regex, and I tried that, but it quickly became way too complicated.

So dBASE to the rescue. I started with the HTMLtoPRG form from the web dBASE tutorial, which reads in a file and steps through it line by line using the low-level file functions, and modified it.

For each line, my modified program would check for the known problems and correct them, write the corrected line to the output file, and move on to the next line until done. After running the program I'd check the output WXR file for non-ASCII characters and other problems, and if a problem was systematic, I'd add the fix to the conversion program and run it again.
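For anyone who wants to try the same approach without dBASE, the loop described above can be sketched in Python. This is not my original program. The fix table, the function names, and the choice to replace mangled letters with HTML entities are all assumptions made for illustration; the real table grew one entry at a time as I found problems.

```python
import io

# Hypothetical fix table: mangled sequence -> intended text.
# These entries are illustrative, not the ones from the real project.
FIXES = {
    "Ã±": "&ntilde;",   # UTF-8 bytes for ñ misread as Windows-1252
    "Ã©": "&eacute;",   # UTF-8 bytes for é misread as Windows-1252
}

def fix_line(line, fixes):
    """Apply every known fix to a single line."""
    for bad, good in fixes.items():
        line = line.replace(bad, good)
    return line

def convert(src, dst, fixes):
    """Copy src to dst line by line, applying the fixes.

    Returns a list of (line number, leftover non-ASCII characters) so the
    remaining problems can be reviewed and added to the fix table.
    """
    leftovers = []
    for lineno, line in enumerate(src, start=1):
        fixed = fix_line(line, fixes)
        dst.write(fixed)
        odd = sorted({ch for ch in fixed if ord(ch) > 127})
        if odd:
            leftovers.append((lineno, odd))
    return leftovers

# In-memory stand-ins for the exported WXR file and the corrected copy.
src = io.StringIO("<p>EspaÃ±a</p>\n<p>cafÃ© “quoted”</p>\n")
dst = io.StringIO()
leftovers = convert(src, dst, FIXES)

print(dst.getvalue())
print(leftovers)
```

With real files you would pass two handles from `open()` with an explicit `encoding` instead of the `StringIO` objects. Each run reports what is still non-ASCII (here, the curly quotes on line 2), and a systematic leftover becomes a new entry in the table before the next run.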

I also modified the site's CSS file and used the program to add some CSS code to <p> or <div> (I can't remember which) to fix paragraph spacing problems.

And that rescued the project, and the new site was able to go live. The power and flexibility of dBASE as a data manipulation tool saved the day.