Category Archives: Technology

Never – it’s still too early to be late.

It seems my article on life caching struck a chord with someone. I’m glad someone at least shares my understanding on this need. He is just starting to search and find his lifestreaming needs and tools, and this really is the best time to start.

I left him this comment on his site (though his comment system mangled my formatting and I’ve fixed it here):

I’m glad you enjoyed my article – I have a few follow-ups to that article that I need to pound out – such as different ways of actually caching the data. After reading a couple of your earlier posts, I can say that you are not late to the life streaming party.

While I agree life streaming is useful, important, and mostly neat, life caching is the evolutionary transition that will need to occur as the data portability movement takes place. Unfortunately, the tools for any type of life caching are terrible at this time. I have some WordPress plugins that I’m cutting my PHP learning teeth on; they have lots of bugs and I need to streamline them. If the design of most of these plugins came from lifestreaming instead of spam blogging (repurposing RSS feeds can go both ways), then this small faction that believes in saving our own data could grow.

Dang – sometimes I’m just too long winded.

P.S. Lifestreaming is becoming a verb. Maybe in a few more years it will be up there with w00t and in the dictionary.

On another one of his posts I left this comment:

I’ll give you the fact that your earlier thoughts on life streaming are correct – it is too much information.

That being said – you need to look at who this is for. Most people don’t care about everything I do and I’ve pruned my RSS feeds that I give to the public down so they only get the useful information.

I archive and save this data for myself – and like the pictures people take of their children growing up, there is no such thing as too much information.

It may be too much information at this point in time – but when personal data mining takes off in a few years as the next hot trend, an archive of this data will be very useful for finding trends at different points of your life that you may otherwise forget (or wish you had forgotten, but Google reminds everyone anyway).

Over at the lifestream blog the author linked to this original article from here – but while he sees usefulness in life caching, he still believes in the importance of life streaming. I don’t think you can truly do one without the other, but with this newer concept only now catching on there is no good way to do it yet. I left him this comment:

We can agree to disagree on the importance of life caching – but without accurate and long term archiving of data (life caching), the lifestream setups most people currently use will only be fleeting, since most use RSS feeds as a transitional phase of their life stream. As the RSS feed items expire they are removed from the lifestream. Even sites like dandelife.com, which does the best job of storing data for a lifestream, only keep transitional data from the auxiliary streams, and eventually it expires.

I really hope the industry catches up with this idea – I’ll get at least one more of those promised life caching articles out this week.

So your company has a requirement to maintain log files for a year?

You don’t know how to go about it and you need to implement it now?

I have a solution for you and best of all it’s free. This solution however is not supported by me, there will be no bug fixes by me, and any damage you cause to your own servers is your own fault. That is my one sentence disclaimer to tell you that you truly are on your own.

For this solution or temporary fix (depending on your organization) you are going to need the following helper programs:

Info-zip – we’ll use this to compress files and save space. Specifically we need the zip.exe file.

MD5SUMS – this allows us to generate MD5 checksums to verify if any file tampering has taken place after the fact. Specifically we need the md5sums.exe file.

Dump Event Log (Dumpel.exe) – this is a tool offered by Microsoft to dump your event logs to a text file. Though this link is part of the Windows 2000 Resource Kit, I have tested it and it does work with the Windows XP and Windows Server 2003 log files.

We take these 3 programs and wrap them to work together via a batch file. All 3 of these programs MUST be in the same directory as the batch file for it to work as designed. Here is the batch file:

@echo off
REM Sets date variables for file name
for /F "tokens=1,2" %%d in ('date /T') do set day=%%d & set date=%%e
set yyyy=%DATE:~6,4%
set dd=%DATE:~3,2%
set mm=%DATE:~0,2%
set startDate=%yyyy%-%mm%-%dd%
REM Adds Computer to prefix the date
set outputname=%computername%-%startdate%
REM Cleans out previous zip files from a bad run
del %outputname%.zip
REM Dumps each of the log files going back for 2 days
REM allowing for overlaps we may miss due to time changes
dumpel -f %outputname%.sec -l security -d 2
dumpel -f %outputname%.app -l application -d 2
dumpel -f %outputname%.sys -l system -d 2
REM creates an MD5 hash for verification checking
md5sums %outputname%.sec >%outputname%.md5
md5sums %outputname%.app >>%outputname%.md5
md5sums %outputname%.sys >>%outputname%.md5
REM Compresses the 4 files
zip %outputname%.zip %outputname%.*
REM Cleans up the unneeded files to save
del %outputname%.sec
del %outputname%.app
del %outputname%.sys
del %outputname%.md5

I’ve included my comments in the batch file – but let’s go through it a section at a time so you can fully understand it.

@echo off

If you don’t know that @echo off suppresses the echoing of each command to your screen in a batch file, I wouldn’t suggest modifying my script, since this is batch file programming 101.

for /F "tokens=1,2" %%d in ('date /T') do set day=%%d & set date=%%e
set yyyy=%DATE:~6,4%
set dd=%DATE:~3,2%
set mm=%DATE:~0,2%
set startDate=%yyyy%-%mm%-%dd%
REM Adds Computer to prefix the date
set outputname=%computername%-%startdate%

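This section builds the prefix we are going to use on all of the files. It lets us work with file names that include the name of the computer you are running this on and the date on which it was run. Note that the substring offsets assume date /T returns the US date format (MM/DD/YYYY); other regional settings will need different offsets.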

del %outputname%.zip

This actually cleans up the zip file if this script has already been run during the current day. Mine is set to delete the zip file completely, since at the end of my script I move my files to a remote location and don’t need archived logs quickly filling up my hard drive.

dumpel -f %outputname%.sec -l security -d 2
dumpel -f %outputname%.app -l application -d 2
dumpel -f %outputname%.sys -l system -d 2

This area does the actual dumping of the log files. Dumpel is the command. The -f switch allows us to specify a file name. As you may notice, I used %outputname% as the first part of the file name, with a suffix that indicates which log it is. The -l switch lets us specify which log we are dumping from the event log (security, application, or system). The -d switch allows us to specify how many days we wish to save. I chose 2 days to allow some overlap between the log files, which is good for security reasons since we shouldn’t miss any events if you change the time of day the script is run. It also gives us two more log files against which to verify the authenticity of the log data we are looking at.

When this section is done running you should have three files. If your computer’s name was BOB-SERVER and the date you ran this was January 3, 2007, the file names would read like this: BOB-SERVER-2007-01-03.sec for your security log, BOB-SERVER-2007-01-03.app for your application log, and BOB-SERVER-2007-01-03.sys for your system log.
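The naming scheme above can be sketched outside of batch as well. This is only an illustration in Python, not part of the script; the function name archive_names is made up for the example, and it uses an ISO date so the result does not depend on the machine’s regional settings.

```python
from datetime import date


def archive_names(computer_name, run_date):
    """Build the file names one run of the batch file would produce.

    Hypothetical helper for illustration; the batch file itself builds
    the same prefix from %computername% and the parsed date.
    """
    # isoformat() always yields YYYY-MM-DD, whatever the locale is
    prefix = f"{computer_name}-{run_date.isoformat()}"
    return {
        "security": f"{prefix}.sec",
        "application": f"{prefix}.app",
        "system": f"{prefix}.sys",
        "md5": f"{prefix}.md5",
        "zip": f"{prefix}.zip",
    }


names = archive_names("BOB-SERVER", date(2007, 1, 3))
print(names["security"])  # BOB-SERVER-2007-01-03.sec
print(names["zip"])       # BOB-SERVER-2007-01-03.zip
```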

md5sums %outputname%.sec >%outputname%.md5
md5sums %outputname%.app >>%outputname%.md5
md5sums %outputname%.sys >>%outputname%.md5

This section generates an MD5 hash of the log file data, which allows you to see whether the data has been tampered with since it was originally generated. It is next to impossible to edit a file and maintain the same hash. This gives you some assurance that your log files are authentic. For those wondering “well can’t I just rerun the md5 program and generate a new hash and save that after modification?” – I have your answer. I didn’t include how to store these files after they are generated, and we will touch upon that question under “What do you do now?” at the bottom. The commands hash your three BOB-SERVER-2007-01-03 files and write the results to a single BOB-SERVER-2007-01-03.md5 file that includes an entry for each of the above files. I decided personally that I didn’t need an md5 file for each of them – feel free to modify this if your needs differ.
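Checking a log against its recorded hash later on could look something like the sketch below. It is written in Python rather than batch, the function names are made up for the example, and it assumes you have already pulled the hash string out of the .md5 file yourself (the exact layout md5sums.exe writes is not covered here).

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to handle large logs."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_untampered(path, recorded_hash):
    """Compare a file's current MD5 with the hash recorded at dump time."""
    return md5_of_file(path) == recorded_hash.strip().lower()


# Demonstration on a throwaway file standing in for a dumped log.
with tempfile.TemporaryDirectory() as workdir:
    log_file = Path(workdir) / "BOB-SERVER-2007-01-03.sec"
    log_file.write_text("1/3/2007 9:00:00 AM logon event\n")
    recorded = md5_of_file(log_file)          # what the .md5 file would hold
    print(is_untampered(log_file, recorded))  # True
    log_file.write_text("1/3/2007 9:00:00 AM edited event\n")
    print(is_untampered(log_file, recorded))  # False
```

The comparison only means something if the recorded hash was stored somewhere the person editing the log could not also reach.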

zip %outputname%.zip %outputname%.*

This compresses the dumped event logs down to a manageable size. I managed to get a 60 MB log file that I generated during various testing phases down to just over 6 MB. I also managed to get 480 KB of log files down to 14 KB. At this point you should have a BOB-SERVER-2007-01-03.zip file which includes your three event logs and your md5 file.

del %outputname%.sec
del %outputname%.app
del %outputname%.sys
del %outputname%.md5

This section cleans up the files outside of the zip. Since I managed to get these files down 90% in size, I don’t need the originals eating up extra space.

What do you do now?

From here I would add a line at the end to move the logs to another server where you can store them for the length of time your organization deems necessary. To help combat the MD5 re-engineering I mentioned above, I would copy the compressed archives to two locations on your network. This will help make having an MD5 meaningful. Another option is adding a script that e-mails you the MD5 hash so you have it saved for reference. Having the MD5 and collecting 2 days of information from the logs would mean an attacker may have to edit 2-4 archives and regenerate the md5s for them – double that if you store a second set of archives in another location.

While this may fulfill your needs for log file capturing and give you an easy way to store the logs, it does not address easy log file auditing and tracking down events. There are all-in-one solutions out there for you to use, and I am not claiming in any way that this is a replacement for those. You need to address your own needs and decide what works for you. This is to give you some time until you decide what you are going to do.
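As a rough sketch of the “copy to two locations” idea, here is one way it could be done in Python instead of adding lines to the batch file. The function name and the destination folders are placeholders I made up; in practice the destinations would be shares on two different servers.

```python
import hashlib
import shutil
from pathlib import Path


def file_md5(path):
    """Hex MD5 of a file (fine for archives that fit in memory)."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def replicate_archive(archive, destinations):
    """Copy one archive to every destination and confirm each copy matches.

    Hypothetical helper: 'destinations' is any list of folders, for
    example two network shares on separate servers.
    """
    archive = Path(archive)
    source_hash = file_md5(archive)
    copies = []
    for destination in destinations:
        target_dir = Path(destination)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / archive.name
        shutil.copy2(archive, target)  # copy2 also keeps the timestamps
        if file_md5(target) != source_hash:
            raise IOError(f"copy at {target} does not match the original")
        copies.append(target)
    return source_hash, copies
```

Calling replicate_archive("BOB-SERVER-2007-01-03.zip", [first_share, second_share]) returns the hash of the archive along with the paths of both copies, and that hash is the value you might e-mail to yourself for safekeeping.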

For years there has been a myth going around online. The detrimental myth that hurts many and makes them feel inferior. The myth that HTML is very easy. I read many books on the history of technology and most of them go into the HTML revolution. The fact that HTML allows everyone to participate online and create their own content. The fact that it is so easy that even the family dog can do it.

Before I get much further I would like to say that I know enough HTML to get myself by. The learning though has come more from need than from actually looking at a book and having it all “click”. Usually I’m looking to do something specific and I find the answer. There are a few commands that I know about that I haven’t even used. I prefer CSS though over HTML.

Most people that interact online use very little HTML.   They use WYSIWYG editors that do all the formatting.  Adding links is as simple as highlighting a line, clicking the link button, and typing in the destination of the link.   I don’t use HTML for this.   For some reason all technology books and classes preach the long lasting myth that HTML is what you have to learn.    This unfortunately is a lie.

Web Architects – they do have to know HTML, but content providers are long past the time when HTML skills were required. If my sister wanted to start a blog, she wouldn’t bother to learn HTML. If my grandmother got to the point where she decided that she wanted to post her recipes online, she would be discouraged by ever looking at HTML code.

Knowing HTML is a benefit but not a necessity. It also is not simple pie publishing that anyone can learn and understand. It’s understanding the logic and taking the time to learn the codes. This time takes away from a creator’s ability to actually create content (if their creation is not strictly or primarily HTML). People waste time on things they don’t need to when learning a good CMS system would be sufficient. The next step, if you had to learn something, would be CSS to style the CMS system the way you want it to look. Then at this point learning HTML may be a benefit.

The teachers, however, train you that HTML is the first thing you have to learn. They waste time on skills that may never be relevant. Unfortunately these skills are a complete waste of time and lose their relevance immediately when students start to put them into practical use. A similar scenario to the educators’ false belief in this system is how programming was taught when I was in school. When I went to college in the 1995 school year, the major programming language in the computing community was C++. The highest level of programming my college taught CS majors was PASCAL. PASCAL was on the way out as a programming language at the time. It had already seen its heyday and was in decline. I’m not arguing that it wasn’t an important skill to learn; I am, however, wondering how that skill would have gotten me a job.

Needless to say I never took a computer programming class at college. I also have never taken an HTML class, but I’ve used the web successfully and created varying levels of content on it since 1994. These skills are also not in my skill set for work either. Though I work in computer security, I’m not a programmer nor do I ever plan to be. I started programming with the original “anyone can learn it” programming language, BASIC.

I hated BASIC on my VIC-20 with a passion. Anything to do with typos took hours to correct after taking hours to input in the first place. This early interaction made me hate programming with a passion. My brother however at least embraced HTML and web design technologies. This will help make him a web architect, but not a content provider. HTML is for the rigid people that like order. They make sense out of the chaos. Content creators on the other hand learn enough web languages to make their content viewable. Creators are much more chaotic than designers.

I am a creator and I say HTML is overhyped. It does allow me to be more flexible in some areas, but that’s only to bring some order to the chaos. I prefer to start with the chaos and work from there.

We could always go back to the myth that you need to understand math to use computers, but that’s a rant for another day.

Sometimes there are videos, or sites with sound in general, that office workers can’t view or watch to their fullest because of sound requirements. Why can’t we “watch” or “track” a video so we can watch it when we get home – without giving it a “digg”?

Giving the story a Digg implies to all of our friends that we liked the story. For those of us who pass our RSS feeds on to our friends, that could give a false sense of how good a clip is. What we need is something we could put into a private store, essentially. Moving it into a private store allows Digg to hide it from other users and not promote a possibly unworthy story.

What I envision is a link on the actual article page (not the article listing page) that allows you to “track”, “bookmark”, or whatever other term you wish to use to mark this to your profile.   There is no reason you should have to use a third party site such as del.icio.us or youtube favorite or “fill in the blank” to perform this feature for you.  It could even be considered a private invisible digg.

There is plenty of room on the profile tab for this to fit, so it wouldn’t break the design. Having this feature seen only by the profile holder would keep it from becoming another option for gaming the system; this includes having no RSS feeds for these items. All in all I can’t figure out a real downside to this.

Please, Digg, help office workers find the videos they want to watch later without giving them a possibly unworthy digg.

Life Caching:

Life caching is setting up sites that you have complete control over to save data from sites over which you have only varied levels of control. It is getting all of your metadata in one place, and saving each detail of data in its place so it’s saved, usable, and recyclable. Life caching is the next stage as the Data Portability Group moves forward. This is not the goal of the Data Portability Group – it is just what their goal enables you to do. The work, however, falls on you, and I can say there is no easy way to do this; some data leakage and loss will always slip through the cracks, at least at this stage of the game.

Isn’t this what Life Streaming Accomplishes? How is Life Caching different?

Life Streaming is the step before life caching. While the concepts share a lot of overlap, the simplest way to put it is that a life stream is a picture in time that does not save your data. A life stream is ephemeral, and current implementations are very fragile. I have a life stream here. RSS feeds expire so data is lost, companies go out of business so the links it points to are gone, or data just gets missed. But to truly get a better picture of life streaming, here is what the life streaming blog says about it:

What is a Lifestream? In it’s simplest form it’s a chronological aggregated view of your life activities both online and offline. It is only limited by the content and sources that you use to define it. Mine is available here. Most people that create them choose a few sources based on sites that track our activities such as Del.icio.us (bookmarking), Last.fm (Music we listen to), Flickr (photos we take) etc…Then you can either find software to host your own, or find sites that provide a platform for you.

Many people have been writing about Lifestreams and the potential value they offer for ourselves and others. Some of those people are Jeff Croft, Jeremy Keith, and Emily Chang. It appears to be a concept that is gaining quite a bit of steam.

I was inspired to create a blog for the Lifestream concept after doing a little research which I wrote about on my blog. Most of the information I found was pretty scattered and there wasn’t a central repository of resources so I thought I should create one. I feel that beyond the self expression of allowing people to track their actions in a passive manner there will be many more exciting technologies that will surface from the backend data aggregation that can occur from people supplying this information.

The rub is that 99% of life streams only save the links from the RSS feeds and do not save the actual data. This is inefficient by design because, like I said before, data gets lost for various reasons. Life caching however has the prime goal of saving that individual data for your use and your manipulation. This gives you freedom to do with it what you want. Take your data anywhere and everywhere, do with it what you will.
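To make the difference concrete, here is a minimal sketch in Python of caching feed items instead of just linking to them. It is not how any particular plugin works; the function name, the sample feeds, and the in-memory dictionary are all assumptions for the example, and a real cache would be written to a database or to disk.

```python
import xml.etree.ElementTree as ET


def cache_feed_items(feed_xml, cache):
    """Copy every item in an RSS 2.0 feed into the cache, keyed by guid or link.

    Items already in the cache are left alone, so nothing is lost when
    an item later drops out of the feed.
    """
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        key = item.findtext("guid") or item.findtext("link")
        if not key or key in cache:
            continue
        cache[key] = {
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
            "content": item.findtext("description", default=""),
        }
    return cache


# Two snapshots of the same made-up feed; post-1 has expired in the second.
MONDAY = """<rss version="2.0"><channel>
<item><guid>post-1</guid><title>First</title><description>Body one</description></item>
<item><guid>post-2</guid><title>Second</title><description>Body two</description></item>
</channel></rss>"""

FRIDAY = """<rss version="2.0"><channel>
<item><guid>post-2</guid><title>Second</title><description>Body two</description></item>
<item><guid>post-3</guid><title>Third</title><description>Body three</description></item>
</channel></rss>"""

cache = {}
cache_feed_items(MONDAY, cache)
cache_feed_items(FRIDAY, cache)
print(sorted(cache))  # ['post-1', 'post-2', 'post-3']
```

A plain life stream built from Friday’s feed would show only two items; the cache still holds all three, including the full text of the one that expired.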

How is this different from the Data Portability Group?

In some aspects, like the concepts of life streaming, life caching shares a few steps in common with the Data Portability Group. What the Data Portability Group means to give is methods and standards that give you tools to do with your data what you will. However, this doesn’t actually mean you will do anything with it or that there will be a standard out of the box configuration for you. The responsibility is on you to act and use these tools that will hopefully emerge.

The Data Portability Group is key for this going forward and allowing you to withdraw your data from the sites that were previously walled gardens. After the garden gates are finally thrown open you have the freedom to do with the data what you will. Please put this power to good use.

Why Do I care?

You should care because this is about you. It is who you are. It does not specifically define you in any way, and most people would understand that it’s not a complete picture of you. There are however aspects of you that you may want to share at a later date. The stories your grandmother told you will get fuzzier over time. Hopefully the idea of life caching, which is still in its infancy, will lead to life story archives that the generations after us can learn from. Our grandkids will be able to mine the data and read the stories you want to pass on.

Will those after you care that you listened to Fall Out Boy on June 7th, 2008? Maybe not, but maybe your grandkids will discover similar music tastes with you. It will give them an understanding of who you are. It will also give them ways to identify with you in a way that you could never identify with the Pilgrims that came across on the Mayflower.

What do I save?

The ideal answer is everything. I would say between the RSS streams I save and the email I collect I am almost up to a 90% efficiency of collecting my personal data online.

To give you an example:

This may seem like a lot of data. It is, but it’s also what we deal with in a normal consuming internet fashion. I don’t use the tools that save which applications I’m running and I’m looking for something like last.fm for movies so it’s more automatic – but that will come in time.

Via e-mail I save my phone calls, my bills, banking history – all this can be stored offline and databased in the home. Your own personal Google for yourself should be the end goal of life caching.

Doesn’t this make it easier for companies to mine data about me?

Yes, the Google monster is omnipotent. Anything you share online can be snagged up and archived away by Google. Is this a good thing? Maybe or maybe not. There is no reason you would need to make most of this data public. You could set up to store this data in email archives, private data sites, or personal home encrypted databases. Life caching is not about displaying your life. It’s about having control over it and saving it for a future date.

As the Data Portability Group expands they hope to implement permission controls for the data. This will help protect against data mining to some extent. The only true answer is that if there are things that you don’t want anyone to know about, do not place them anywhere that is publicly accessible or in the hands of any company or person other than yourself.

How do I store and backup the data?

There are many ways. I use WordPress with a variety of plugins to maintain all my data on the site. I also use quite a bit of feedburner kung-fu and gmail filters. The key thing is that I can extract this data into other formats from just those two methods. I could dump it into a personal database or wiki. The tools are only at the beginning stages of making this useful for you. It is easier to back it up before you lose it than to want it after it is gone.

What Can’t I backup?

In an ideal world there is nothing that you can’t backup. We don’t however live in an ideal world. Mostly the limitations deal with which sites give you some form of access to your data. Some don’t allow you to take your friends list with you. Other sites don’t allow you to get posts out unless you implement site scraping, which could break the terms of service you agreed to.

The limitation is in the tools and the agreements and the Data Portability Group is helping lead the future in developments that will allow you greater access to your own data.

How do you share with your friends?

Beyond having a public blog which your friends may or may not visit, there are multiple ways. I have two major RSS feeds coming out of my website. The public RSS feed gives everyone a filtered feed of my posts. This way they don’t get spammed with every song I listen to on last.fm or every single story I digg when it happens. This RSS feed then goes and notifies my twitter friends that I’ve posted something new that I find relevant. It also goes out and feeds the stories to tumblr, jaiku, and facebook. It is also the feed that my RSS readers get.

The secondary feed goes to feedburner and gives me a post to email option. This allows me to save all of my daily posts in an email archive so they are searchable through gmail for myself. Users could subscribe to this feed if they asked me; it’s just that the amount of data can sometimes be overwhelming, and I’ve had a few complaints from a couple of twitter friends.
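The split between the two feeds boils down to a category filter. The sketch below is only an illustration in Python, not the actual WordPress setup; the function name, the post titles, and the category names are made up for the example.

```python
def public_feed(items, noisy_categories):
    """Return only the posts that carry none of the noisy categories.

    'items' is a list of dicts with a 'title' and a list of 'categories';
    the full, unfiltered list plays the part of the secondary feed.
    """
    noisy = {name.lower() for name in noisy_categories}
    return [
        item for item in items
        if not noisy & {name.lower() for name in item["categories"]}
    ]


all_posts = [
    {"title": "Life caching explained", "categories": ["Technology"]},
    {"title": "Listened to a song", "categories": ["Last.fm"]},
    {"title": "Dugg a story", "categories": ["Digg", "Technology"]},
]

for post in public_feed(all_posts, ["last.fm", "digg"]):
    print(post["title"])  # Life caching explained
```

Everything still lands in the full list for archiving; only what gets shown to friends is trimmed.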

From my wordpress blog I post to other blog sites. For example when I finish and publish this post it will also be posted at my msn spaces accounts, my old blog at blogspot, vox, xanga, myspace, livejournal, and dandelife. So no matter where you have friended me you can get notifications that I’ve published and posted something.

Finally, in some of the message boards I use, my signature contains a JavaScript snippet that rotates my 5 newest stories so people can read the headlines and click if they find them interesting.

Do you truly think that this is the future?

Yes, your data is you and part of you is also your data. Hopefully the stories we wish to pass down can be archived, saved, and cached for all to read and consume for generations to come.

Final Notes:

I hope this explanation is relevant for you and that you have an interest in preserving your own data. Each of the links in this article will help you with different aspects of your design. If you have further questions or need some details expanded, please leave a comment or contact me so we can hash out ideas and clarify them.

For those heavily interested I would recommend posting and devising ways that you can cache your online and offline life. Work with the Data Portability Group on tools to make this work. The most important thing is to only deal with companies that allow you to do with your data what you want and place it where you need it. Thank and support the companies that do.

I managed to block a lot of the spammy categories from hitting the RSS feed.  This means that it should be a cleaner read with less waste with just the stuff I want to save.    Tonight or tomorrow I’ll be applying the same filters on the front page of my blog.

This migration to wordpress has been just one hurdle after another

It’s been giving me duplicate entries……………………..grrrrrrrrrrrrrrrrr

Currently I’ve managed to disseminate my blog articles to several different blogs (coming in how creeva.com works part 3). One thing I’m looking for is a way to sync photographs across multiple sites. Just as I use WordPress as my front end to multiple sites, I truly love flickr and its tools and plan on always using it as my primary photo storage site. I could manually upload to the other sites I use, but what fun is that? Making everything work behind the scenes by itself is one of my sick little pleasures.

We have two possible solutions to make this work – client side or server side.

Client side – I could download the photos and then upload them by hand, or with third party tools, to each individual site I could possibly want the data on. This is the option I may break down and use, but like I said I don’t want to.

Server side – what I want.

I want to be able to upload the data to Flickr and set the data to automatically be synced to other sites across the net. Some of the sites I would like to sync data with are photobucket, myspace, facebook, and anything else where I can get some use out of it.

Does anyone know of a tool that can sync from flickr to other services?

According to this article I found on Digg, Nintendo Technical Support had the user smack the Wiimote against the user’s palm to fix it. As foolish as this seems to some people, Compaq had an issue years and years ago where the hard drives installed in their computers did not include enough oil. Their technical support people had you lift your computer 2-3 inches above your desk and drop your $2000.00 investment. So this type of scenario is not new. I know I have fixed many things, from computers to cars, by hitting them once in a while.


Ok so I’m finally happy enough with the migration that I moved all the DNS settings over to my new web host and turned the old creeva.com back to creeva.blogspot.com.   As they say, breaking up is hard to do.   Hard for me since I had some functionality in the old blog that I was missing (until earlier today) in the new blog.

First of all, obviously I have moved to a sparser design. I like zen-like simplicity. No distractions and the meat in front of you on the plate. Part of the reason is that in moving from the blogger platform to the wordpress platform I couldn’t use the same themes, and I was being lazy when it came to the idea of converting the theme. After being distraught and having issues over this fact, I decided I would make a new layout. After coming to this conclusion I became happier and more excited about the new layout. There are some more things to do but that will come with time.

The basic design changes you will see are fewer widgets on the front page (and no advertising currently). I moved some of the functionality off to sub pages (something blogger didn’t support).

My subpages are across the top. They are:

Random Quotes    -  These are things I’ve collected over time (this page may not make the grade long term)

About Me – A Random Self Observation

Security – Some of the Security I’ve enjoyed that I wrote myself

Music – Not finished – but is going to hold information about the bands I play in

Photos – My photo album (sourced at flickr)

Videos – Videos I’ve made or uploaded – or just videos I like

Contact Me – My contact information

Links – Links to friends/

It’s late and I’ll get to writing up part 2 of how the new creeva.com works tomorrow.  The next part of this mini-series is which wordpress plug-ins I am using.