Archive for December, 2004

Projects and Keeping Busy

Sunday, December 19th, 2004

It’s been a while since I’ve updated this blog. I’ve spent most of that time working on my own blog engine to get around issues I’ve run into with WordPress. I hope to have it up over the Christmas break, along with about a dozen other projects that are nearing completion.

I now have quite a few technology-related blogs going strong on my feeds website (http://www.web-ology.com/feeds/). All in all, there are over 23,000 feeds posted and the number keeps growing. Even though I work at an ISP, my feed aggregator is starting to grow to the point of taxing resources. I’ve tried to run the aggregator as seldom as possible, but I’ve started losing news items because feeds aren’t being pulled often enough.

Along the way, my project has grown from using feedonfeeds to a hybrid news aggregator of my own design. While feedonfeeds works very well, I’ve started needing features that it simply doesn’t offer. Since it is based on the MagpieRSS engine, it was easier to write my own aggregator around MagpieRSS, and that was the path I took.

One major feature that I needed was the ability to weight a feed and to schedule feeds to be pulled at different intervals. Most of my feeds are updated once a day, or fewer than a dozen times a day, while others may be updated hundreds of times a day. More and more feeds only display the last ten or twenty items, which means items are lost if the feed is not pulled often enough. At the same time, some websites such as Slashdot have started suffering from feed pull overload and will now ban your script if you pull too frequently. That was another reason to write my own. My new system allows more control over how often each feed is pulled, and these factors, along with a dozen or so more, went into the new design.
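To make the idea of weighted feeds on separate schedules a bit more concrete, here is a minimal sketch in Python. It is not the code from my aggregator (which is built around MagpieRSS); the class names, the `weight` and `interval` fields and the example URLs are all made up for illustration. Each feed carries its own pull interval, a priority queue keeps track of which feed is due next, and feeds that come due together are returned heaviest first.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class ScheduledFeed:
    # Only next_pull takes part in ordering, so the heap sorts by due time.
    next_pull: float
    url: str = field(compare=False)
    interval: float = field(compare=False)  # seconds between pulls (assumed unit)
    weight: int = field(compare=False, default=1)  # hypothetical priority value


class FeedScheduler:
    """Hypothetical scheduler: one pull interval per feed, weight breaks ties."""

    def __init__(self):
        self._queue = []

    def add(self, url, interval, weight=1, now=0.0):
        # A newly added feed is due immediately.
        heapq.heappush(self._queue, ScheduledFeed(now, url, interval, weight))

    def due(self, now):
        """Return URLs of feeds due at `now`, heaviest first, and reschedule them."""
        ready = []
        while self._queue and self._queue[0].next_pull <= now:
            ready.append(heapq.heappop(self._queue))
        for feed in ready:
            # Schedule the next pull one interval from now, never sooner,
            # so a busy site is not hit more often than its interval allows.
            heapq.heappush(
                self._queue,
                ScheduledFeed(now + feed.interval, feed.url, feed.interval, feed.weight),
            )
        ready.sort(key=lambda f: -f.weight)
        return [f.url for f in ready]


scheduler = FeedScheduler()
scheduler.add("http://example.com/slow.xml", interval=86400, weight=1)
scheduler.add("http://example.com/busy.xml", interval=300, weight=5)
print(scheduler.due(now=0))    # both feeds are due on the first pass
print(scheduler.due(now=300))  # only the five-minute feed is due again
```

A cron job or a simple loop would call `due()` with the current time and hand the returned URLs to the fetcher. Keeping the interval per feed is what lets a busy feed be pulled every few minutes while a strict site is left alone for most of the day.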

On the code generation front, my system is mature enough to generate fully working website admins. I’ve been playing with a few new designs for how to lay out the overall framework. For my feed scripts I’m using more of a publisher model, which turns websites back into static HTML instead of generating pages dynamically. Getting a good pattern down has been a challenge so far, but I’ve made some good headway. I’m able to publish these changes, and the system allows for remote publishing over FTP, which tries to do an intelligent job of only pushing what’s changed. This can be important when trying to push large websites of 1300+ pages or, in this case, 23M+ worth of pages.
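For anyone curious what "only pushing what’s changed" might look like, here is a rough sketch in Python. It is an assumption about one way to do it, not a description of my publisher: it hashes every generated file, compares the hashes with a manifest saved from the previous publish, and uploads only the files that differ. The function names and the manifest format are invented for this example, and `upload` assumes the remote directories already exist.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(root, manifest):
    """Compare files under `root` with the previous manifest.

    `manifest` maps relative paths to digests from the last publish
    (a hypothetical format). Returns (changed_paths, new_manifest).
    """
    changed = []
    new_manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            digest = file_digest(full)
            new_manifest[rel] = digest
            if manifest.get(rel) != digest:
                changed.append(rel)
    return sorted(changed), new_manifest


def upload(ftp, root, paths):
    """Send the given files over an open ftplib.FTP connection.

    Assumes every remote directory already exists.
    """
    for rel in paths:
        with open(os.path.join(root, rel), "rb") as fh:
            ftp.storbinary("STOR " + rel, fh)
```

The new manifest would be saved (as JSON, for instance) after a successful upload so the next run has something to compare against. Hashing every file still costs a full read of the site on disk, but that is cheap next to re-sending every page over FTP.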