The Data Types

gRSShopper incorporates a number of distinct data types. Some of these data types are very general, while others are very specific. Each type of data is related to the others.

For example, the gRSShopper harvester will harvest a link from a given feed. A person, if he or she has admin privileges, can transform this link into a post, adding his or her own comments. The post will contain information about the original link's author and journal.

The site administrator can see the different data types from any Admin screen; they are all listed on the left-hand side of the screen. Site visitors, on the other hand, will see whatever the administrator wants them to see.

Here, then, are the data types:


An author is a person who writes a web resource. Authors are the creators of links, which are harvested by gRSShopper, and of resources referenced by posts.

It should be mentioned at this juncture that posts - which are like blog posts, and are authored by the owner of the site - are unlike the posts you'll find on Blogger or WordPress. Each post contains fields for a link, an author and a journal. The idea is that a person using gRSShopper typically writes about something, and so there is a primary reference for each post.

The author, as one of those referents, can be identified with a name and a URL. I'd like this (eventually) to be the author's OpenID website. For now, it's just the author's home page.

On this site, visitors can view a separate page for each author. For example, here is the author page on this site for Stephen Downes. That's me. On the page, if the site owner encodes the view appropriately, you can see a list of the posts which reference the author.


A box is a bit of HTML text that can be inserted into any page on the website with a simple command (see page commands).

Boxes are useful because they are cached by the system, reducing the overhead on the system. They can also be used to access the database, which allows live contents (for example, a list of pages) to be displayed.

To the left you can see an example of a 'Contents' box. This box is a simple list of all the pages on the site. The contents box need never be updated; pages will be added automatically as they are created. If you have Admin access, you can see the encoding for the Contents box here.


A cite is created automatically by the harvester. It is a record that indicates that one resource (as defined by its URI) has referenced (or 'cited') another resource (also referenced by its URI).

(Note that cites are currently disabled in the code while I work out a more efficient way to do them...)
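To make the cite record described above concrete, here is a minimal sketch. It assumes a cite is just a pair of URIs; the field names `citing_uri` and `cited_uri` are hypothetical and are not taken from the gRSShopper code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cite:
    # Hypothetical field names, chosen only for this illustration.
    citing_uri: str  # the resource that makes the reference
    cited_uri: str   # the resource being referenced


def cited_by(cites, uri):
    """Return the URIs of all resources that cite the given URI."""
    return [c.citing_uri for c in cites if c.cited_uri == uri]


cites = [
    Cite("https://example.org/a", "https://example.org/b"),
    Cite("https://example.org/c", "https://example.org/b"),
    Cite("https://example.org/b", "https://example.org/d"),
]
```

With records like these, `cited_by(cites, "https://example.org/b")` lists the two resources that reference `b`.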


An event is, as the name implies, a record that describes something that is scheduled. Hence, events have a 'start' and an 'end' field, which would indicate when they take place, as well as a location field, for location information.
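As a rough illustration, an event record with the 'start', 'end' and location fields mentioned above might be modelled like this. The `title` field and the `is_happening` helper are assumptions added for the example, not part of the documented schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    title: str        # assumed field, for illustration only
    start: datetime   # when the event begins
    end: datetime     # when the event finishes
    location: str     # free-text location information

    def is_happening(self, moment: datetime) -> bool:
        """True if the given moment falls between start and end (inclusive)."""
        return self.start <= moment <= self.end


talk = Event(
    title="Example talk",
    start=datetime(2024, 5, 1, 9, 0),
    end=datetime(2024, 5, 1, 10, 30),
    location="Room 101",
)
```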

The event data type is a work in progress. I would like to be able to read and write event metadata, not only in iCal but also in RSS and JSON. The idea would be that you can use this site to create events, and also import events listings, and in turn use the site to update external calendars.


A feed is a remote source of content. A feed might be a blog, a newspaper, a magazine, a journal, or a Twitter account. If new content is added to this location on a regular basis, it can be a feed.

gRSShopper has the option to harvest feed contents. The harvester is controlled by the gRSShopper admin. The harvester will only harvest approved feeds. The following types of feeds are defined:
Inactive - this is a feed with no feed endpoint (that is, no associated RSS or JSON data stream location)
- this is a feed that has an associated RSS or JSON data stream location, but has not been approved for harvesting (it might be a feed that was suggested by a user).
Approved - this is a feed with an associated RSS or JSON data stream location and has been approved for harvesting
Retired - this is a feed that was previously harvested, but where harvesting has ceased (it is kept because there may still be data in the database related to that feed).
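The rule that only approved feeds are harvested can be sketched as a simple filter. This is a minimal illustration that assumes each feed is a dictionary with hypothetical `endpoint` and `status` keys; the real table columns may be named differently.

```python
def harvestable(feeds):
    """Return only the feeds a harvester would process.

    A feed qualifies if it has a data stream location (endpoint)
    and its status is 'approved'. Key names are assumptions.
    """
    return [
        feed for feed in feeds
        if feed.get("endpoint") and feed.get("status") == "approved"
    ]


feeds = [
    {"title": "No endpoint yet", "endpoint": "", "status": "inactive"},
    {"title": "Good blog", "endpoint": "https://example.org/rss", "status": "approved"},
    {"title": "Old journal", "endpoint": "https://example.org/old.xml", "status": "retired"},
]
```

Here only the feed titled "Good blog" would be passed to the harvester.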

Feeds have various values that control their processing. These are documented in Feed Rules.


A journal is a magazine or publication. It can be used in place of a feed if you wish, but journals are not harvested.


A link is a resource on the World Wide Web defined by a URL (or, in the future, by a hash). Links are typically created one of three ways:
- By the harvester - each individual resource harvested from a feed will be stored as a link by default.
- By a post - when a link post is created, the link it refers to is stored as a link.
- By an analysis tool - when contents are analyzed, links that are found are listed as links.

Links are useful because they help associate one resource with another. Two posts referring to the same link, for example, may be associated through the graph.
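One way to picture this association is an index from link URLs to the posts that refer to them. The sketch below assumes the graph can be reduced to (post id, link URL) pairs, which is a simplification made for the example.

```python
from collections import defaultdict


def posts_sharing_links(post_links):
    """Group post ids by the link they refer to.

    post_links is an iterable of (post_id, link_url) pairs.
    Only links referenced by more than one post are returned.
    """
    index = defaultdict(set)
    for post_id, url in post_links:
        index[url].add(post_id)
    return {url: ids for url, ids in index.items() if len(ids) > 1}


pairs = [
    (101, "https://example.org/paper"),
    (102, "https://example.org/paper"),
    (103, "https://example.org/other"),
]
```

Posts 101 and 102 end up associated because both refer to the same link.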

Links decay over time. In the reader you can choose between 'fresh' and 'stale' links. Orphan links (which are not associated with any other resource in the graph) are automatically deleted.
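The two ideas above, freshness and orphan removal, can be sketched as follows. The 30-day cutoff and the shape of the graph data (pairs of link id and related resource id) are assumptions made for this example; the actual decay rule is not specified here.

```python
from datetime import datetime, timedelta


def is_fresh(harvested, now, max_age_days=30):
    """Treat a link as 'fresh' if it is no older than max_age_days.

    The 30-day default is a hypothetical value for illustration.
    """
    return now - harvested <= timedelta(days=max_age_days)


def prune_orphans(link_ids, edges):
    """Keep only links that appear in at least one graph edge.

    edges is an iterable of (link_id, other_resource_id) pairs.
    """
    connected = {link_id for link_id, _ in edges}
    return [link_id for link_id in link_ids if link_id in connected]
```

For instance, with links `l1`, `l2`, `l3` and edges touching only `l1` and `l3`, `prune_orphans` would drop `l2`.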


Many types of data have properties that are predefined. For example, a 'post' might be a 'link' post, an 'article', an 'announcement', a 'shownote', or whatever.

These predefined types are defined in the 'optlist' table (one day I'll code a way to import optlists from standards organizations). Titles define where the optlist applies. The optlist titled 'post_type', for example, defines the possible types of posts.
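As a small sketch of the idea, an optlist can be thought of as a titled list of allowed values. The dictionary below stands in for the 'optlist' table, using the post types named above; the function name `is_valid_option` is hypothetical.

```python
# Stand-in for the 'optlist' table: title -> allowed values.
OPTLISTS = {
    "post_type": ["link", "article", "announcement", "shownote"],
}


def is_valid_option(title, value):
    """Check whether a value is allowed by the optlist with this title."""
    return value in OPTLISTS.get(title, [])
```

So `is_valid_option("post_type", "article")` holds, while an unknown type or an unknown optlist title is rejected.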

Optlists are really useful for defining your own system of categorization.


A page is a static web document with an HTML address. You can create pages using gRSShopper. You are reading a page right now.

Pages may extract data from the database and display it in different formats. There's a list of page commands that define how to customize page contents.

Pages may use predefined templates, or not, and may include boxes (defined above).

The purpose of a page is to provide a reliable web-based location for static or slowly changing content.


A person is a gRSShopper user on this site. Persons may access gRSShopper features by logging in.

There are three major types of person:
- Admin - this is a person who has access to the system and database
- Registered - this is a person who can subscribe to services such as an email newsletter
- Anonymous - this is any other person
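The three types of person can be sketched as a simple classification. This assumes a logged-in person is represented as a dictionary with a hypothetical `is_admin` flag, and that a visitor who is not logged in is represented as `None`.

```python
def person_type(person):
    """Classify a visitor as 'admin', 'registered' or 'anonymous'.

    person is None for a visitor who has not logged in; otherwise it
    is a dict with an assumed 'is_admin' flag.
    """
    if person is None:
        return "anonymous"
    return "admin" if person.get("is_admin") else "registered"


def can_access_database(person):
    """Only an Admin has access to the system and database."""
    return person_type(person) == "admin"
```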


A post is the primary type of content produced by authors using gRSShopper. Any command that works in a page can work in a post. Posts are accessible as stand-alone pages. However, the default URL for a post is defined by the post number, not by the 'location' parameter.

The following different types of posts are defined:
- link - this is a post that contains a reference to someone else's web page or post, including the title and URL. A link post will identify the web page author and feed using the graph. Normally the description will be displayed to readers; the description is your commentary about the other person's page.
- article - this is a post that you write and that does not refer to a specific web page (though it may 'cite' some links via the graph). The post content is usually displayed to readers. In the content editor the 'write' tab is used to create post content.

A post may be displayed in multiple formats using a 'view' (see below). This allows a post to be published as HTML, RSS, JSON or whatever all at once. Views for posts are named with reference to the post type. So, for example, an HTML link post will use the view 'post_link_html' while an RSS article post will use the view 'post_article_rss'. Different pages will specify different views using page commands.
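The naming pattern in these examples can be sketched as a small helper. The function `view_name` is hypothetical; it simply joins the table name, the optional record type and the output format with underscores, matching the view names quoted above.

```python
def view_name(table, fmt, record_type=None):
    """Build a view name such as 'post_link_html'.

    record_type is optional, so tables without subtypes yield
    names of the form '<table>_<format>'.
    """
    parts = [table]
    if record_type:
        parts.append(record_type)
    parts.append(fmt)
    return "_".join(parts)
```

For example, `view_name("post", "html", "link")` gives `post_link_html` and `view_name("post", "rss", "article")` gives `post_article_rss`.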


This is special content with a way to embed YouTube videos and Slideshare presentations. It also contains a player for audio recordings.

For videos, put the video ID (e.g., 5ZmHRElAArc) in the 'Youtube' field.

For presentations, put the 'key' in the 'Slideshare' field.
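For the video case, here is a minimal sketch of how a stored video ID could be turned into embed markup. It uses YouTube's standard `https://www.youtube.com/embed/<id>` address; the function name and the default dimensions are assumptions, and this is not necessarily how gRSShopper builds its own player.

```python
def youtube_embed_html(video_id, width=560, height=315):
    """Return an iframe snippet for a YouTube video ID."""
    src = f"https://www.youtube.com/embed/{video_id}"
    return (
        f'<iframe width="{width}" height="{height}" '
        f'src="{src}" allowfullscreen></iframe>'
    )


snippet = youtube_embed_html("5ZmHRElAArc")
```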


This contains a reference to a publication. I use it to list my own publications for my CV, as it contains fields for journal name, volume and number, pages, etc. As well, the 'post' field can be associated with a post containing the text of the journal article (as an 'article' post; if you do this, you can indicate publication information in the post).


Still working on this.


Still working on this.


This is a predefined bit of code that can be used as either a page header or a page footer. Templates are added automatically to the beginning and end of a page when the page is published.

Any command that works in a page also works in a template.

Templates are really useful not only for web pages but for things like RSS or JavaScript pages.
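The header and footer behaviour can be pictured with a very small sketch. The `publish` function below is hypothetical and only shows the wrapping step; it does not process any page commands.

```python
def publish(body, header="", footer=""):
    """Wrap page content in a header template and a footer template."""
    return header + body + footer


html_page = publish(
    "<p>Hello</p>",
    header="<html><body>",
    footer="</body></html>",
)
```

The same wrapping works for non-HTML output, for example an RSS channel opening as the header and its closing tags as the footer.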


Still working on this.


A view defines how to present an individual record. For example, the 'author_html' view will define a string of HTML (or JSON, or RSS) code to display an individual author record.

Views display the contents of the record by embedding field contents. Also, any command that works in a page works in a view, allowing a view to contain lists of resources from other tables.
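Embedding field contents in a view can be sketched with Python's `string.Template`. The `$field` placeholder syntax here is a stand-in chosen for the example and is not gRSShopper's own view syntax; the field names `name` and `url` follow the author fields described earlier.

```python
from string import Template


def render_view(view, record):
    """Fill a view with the fields of one record.

    Unknown placeholders are left untouched rather than raising.
    """
    return Template(view).safe_substitute(record)


author_html = '<a href="$url">$name</a>'
record = {"name": "Stephen Downes", "url": "https://example.org/"}
rendered = render_view(author_html, record)
```

The same record could be passed to a different view (say, a JSON or RSS one) to produce another format from the same data.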