Datasets are like storage jars

SNA datasets in UCINET and NetDraw

SNA datasets (as opposed to ‘data’) comprise collections of separate files that mesh together through direct, text-based linkages. They are, basically, relational databases. For SNA a ‘network’ is a set of ties for a single, defined relation. Thus, for example, the Krack-Sociom example contains three defined relations (Advice, Friendship and Reports To) and may be spoken of as three (social) ‘networks’. Most (non-SNA) researchers would see the office as a bounded and defined social ‘setting’ (ethnomethodology) with each actor involved in potentially multiplex relations with the others. By contrast, SNA analytics treat this as a multi-relational dataset.

A UCINET dataset is based on at least one network relation (tie data), and this initial, working element is a ‘dataset’. Any working dataset requires two system files, with the extensions ##h and ##d. If there is attribute information about the nodes (referred to as node data in NetDraw and attribute data in UCINET), it is held in an additional working dataset with its own ##h and ##d system files. The UCINET convention is to include ‘-Net’ and ‘-Att’ in the names of the two connected UCINET datasets. If we have both tie data and node data, I will refer to it as a ‘rich’ dataset.

The conventions of a sociomatrix (UCINET Data ->Display)

UCINET data displays always show a matrix. Tie data always displays as a sociomatrix. Sociomatrices are a particular kind of matrix where the row and column headers contain the same labels. They are always square. Set theorists call them adjacency matrices: a 1 (digital tick) says the node in the row header is adjacent to (linked to) the node in the column header. UCINET will always display the contents of its working files as matrices or, with tie data, sociomatrices. Display is always available in the main ribbon and can also be reached through Data ->Display or with the Ctrl-D hotkey.

Here is the display for the tie data of the Krack-Sociom-Friendship relation.

Note that a sociomatrix defines the full network space. It specifies every possible (directed) tie and also the reflexive ties (self-loops) in the diagonal. The full number of directed ties (plus reflexives) is the number of nodes, squared: N x N. The number of reflexive ties (the diagonal) equals the number of nodes, N. Hence the number of directed (non-reflexive) ties is (N x N) - N = N x (N-1).
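The arithmetic can be checked with a few lines of Python. This is only a sketch: the function name tie_counts is my own, and the value 21 assumes that the Krack-Sociom network has 21 actors.

```python
def tie_counts(n):
    """Return (all cells, reflexive cells, directed non-reflexive ties) for n nodes."""
    total = n * n                 # every cell of the sociomatrix
    reflexive = n                 # the diagonal (self-loops)
    directed = total - reflexive  # same as n * (n - 1)
    return total, reflexive, directed


# Assumption: 21 actors in the example network.
print(tie_counts(21))  # (441, 21, 420)
```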

The three relations are based on directed ties. This means that ties from A to B do not imply that there must be a tie from B to A. For relations where this implication is valid we would clean the data so that every tie was also present in the reciprocal cell. UCINET refers to this as a symmetrical sociomatrix.
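To make the idea of a symmetrical sociomatrix concrete, here is a minimal Python sketch. The matrix is a made-up three-node example, not the Krack-Sociom data, and the helper names are hypothetical. The "maximum" rule used here (a tie in either direction is copied to the reciprocal cell) is one of several possible ways to symmetrise, chosen purely for illustration.

```python
def is_symmetric(matrix):
    """True if every cell equals its reciprocal cell."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


def symmetrize_max(matrix):
    """Copy each tie into its reciprocal cell (a tie in either direction counts)."""
    n = len(matrix)
    return [[max(matrix[i][j], matrix[j][i]) for j in range(n)] for i in range(n)]


# Made-up directed data: A -> B and B -> C only.
directed = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]

print(is_symmetric(directed))                   # False
print(is_symmetric(symmetrize_max(directed)))   # True
```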

To read a sociomatrix you read along the rows: Social Actor #1 goes for advice to #2, #4, #8, #16, #18, #21 (6 people altogether). Actor #2 goes to 3 people …. Read the rows of the friendship matrix for friendship nominations.

In a sociomatrix the columns contain the contacts coming in. In Advice we see that #1 and #2 both have many inward nominations (13 and 18).
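Reading rows for outgoing nominations and columns for incoming ones amounts to taking row sums and column sums. The following Python sketch shows this on a small invented matrix (four hypothetical actors A to D, not the Krack-Sociom data).

```python
labels = ["A", "B", "C", "D"]

# Invented sociomatrix: rows are senders, columns are receivers.
matrix = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
]

# Row sums: how many others each actor nominates.
out_degree = {labels[i]: sum(row) for i, row in enumerate(matrix)}

# Column sums: how many nominations each actor receives.
in_degree = {labels[j]: sum(row[j] for row in matrix) for j in range(len(labels))}

print(out_degree)  # {'A': 2, 'B': 1, 'C': 3, 'D': 1}
print(in_degree)   # {'A': 3, 'B': 2, 'C': 1, 'D': 1}
```

Both dictionaries add up to the same total (7 here), because every di-tie is counted once as outgoing and once as incoming.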

NB: With 2-mode data the ‘sociomatrix’ display becomes a rectangular membership matrix. Conventionally the names of persons are the row headers, the organisations (or other grouping labels) are the column headers, and an entry in a cell (and its symmetrical/reciprocal cell) indicates that the person/node in the row header is ‘affiliated with’ (a member of) the organisation/group in the column header.

What do you see in a (NetDraw) network diagram?

The spring-embedded layout procedure produces order in a network diagram. In essence, it gathers the most connected nodes into the core and places peripheral nodes in zones where they have some connections. You will have seen the power of the layout procedure when you selected the Friendship and Reports_To networks and applied it.

Network diagrams can be seen as road maps. They place nodes and show possible paths to get from one to another.

Graph theorists see diagrams differently. They conceptualise graphs as objects in 3- or even n-dimensional space. If you look at the diagram as 3-dimensional, the lines are now edges, and the nodes are simply the places where the edges meet, the vertices (the singular is vertex). Graph theorists talk of vertices and edges, rather than nodes and ties. They refer to directed ties as arcs.

The terminologies of SNA (nodes and ties), graph theory (vertices and edges) and matrix operations intermingle in the literature. You need to be able to think in all three.

[Prell. P.9. graphs and digraphs]

Ties, di-ties, edges and arcs

SNA tends to use the term ‘ties’ to mean any connection between two nodes. However, within NetDraw and UCINET the software reads a tie data record as a directed, FROM TO, tie. I will, therefore, refer to our basic unit of observation (and record) as a di-tie – short for directed tie.

Graph theory mostly thinks of ties as undirected ‘edges’ but uses the term ‘arc’ (think of an electrical spark) for a directed tie. Much of its software will, by default, read an edge record as two directed ties/arcs. Given this treatment it would be better to call edges bi-directional ties, and I will sometimes do this.
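The "one edge becomes two arcs" treatment can be sketched in Python as follows. The function name edges_to_arcs is hypothetical and does not come from any particular package; it simply illustrates the expansion.

```python
def edges_to_arcs(edges):
    """Expand each undirected edge (a, b) into the two di-ties a -> b and b -> a."""
    arcs = []
    for a, b in edges:
        arcs.append((a, b))
        if a != b:  # a self-loop has no separate reverse direction
            arcs.append((b, a))
    return arcs


edges = [("Al-1", "Bru-2"), ("Bru-2", "Chas-3")]
for sender, receiver in edges_to_arcs(edges):
    print(sender, "->", receiver)
```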

[Prell. P 11; Arcs involve senders and receivers. Good point!; ties create paths… but she does too much.] [Prell p.13: goes to matrices. Adjacency (i.e. focus on nodes); binary; symmetrical, reciprocal (reciprocating/mutual), intensity. i.e. this is the language of mainstream SNA.]

The VNA (Visual Network Analysis) tie data format: (EdgeArray1 format)

The VNA tie data format was developed for NetDraw. Prior to that time UCINET used the graph theoretic term ‘edgelist’ to describe this format and that term remains in UCINET.

VNA tie data format arranges the di-ties in consecutive records reading across the rows of a sociomatrix. The first di-tie is FROM Node #1 TO Node #1 (a reflexive tie; in graph theory, a self-loop), the second FROM Node #1 TO Node #2, the third FROM Node #1 TO Node #3, and so on.
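The row-by-row ordering can be illustrated with a short Python sketch. The labels and matrix below are invented for the example, and the function name and the keep_zeros switch are my own assumptions. With keep_zeros=True every cell is listed, starting with the reflexive cell of the first node; with the default only the ties that are actually present are kept.

```python
def matrix_to_tie_list(labels, matrix, keep_zeros=False):
    """List di-ties as (FROM, TO, value), reading across each row in turn."""
    records = []
    for i, sender in enumerate(labels):
        for j, receiver in enumerate(labels):
            value = matrix[i][j]
            if value or keep_zeros:
                records.append((sender, receiver, value))
    return records


labels = ["Al-1", "Bru-2", "Chas-3"]
matrix = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 0, 0],
]

for record in matrix_to_tie_list(labels, matrix):
    print(record)
```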

A massive advantage of having the data in tie data list (edgelist) format is that we can have an array of multiple columns of information about each di-tie; hence the UCINET designation ‘EdgeArray’, with the 1 appended to indicate 1-mode (not 2-mode) data.

The array of extra columns may be about different relations (i.e. a multiple-relations, or multirelational, dataset) and it can contain scales or measures of ‘tie strength’ for a variable (valued data). It is crucial to have clear and precise column headers (with no white spaces).

Keeping your raw data in a tie data list allows you to record all types of qualitative and quantitative data related to the di-tie identified by the initial two columns.

Let’s examine data in this VNA format in the Krack-Sociom AllRels dataset. From NetDraw open the Krack-Sociom AllRels dataset (in the Demo Datasets).

In NetDraw go to File ->Save data as -> vna. Use the browse window to navigate to your working folder. Give the file a temporary name (Krack-AllRels Temp) and save it. Right-click on this saved file and use Open with… Notepad.

The *node data and *tie data sections of this file contain the data from the UCINET dataset arranged in NetDraw’s VNA (Visual Network Analysis) format. (The *node properties and *tie properties sections record the display properties of the diagram; they back up this information when you are working with a file.) We have not loaded any node data, so the *node data section has only one column of information, the node IDs (which are system numbers). The tie data section takes the rows of each matrix, collects them, and writes them out cell by cell: FROM Al-1 TO Bru-2, Advice (tick), Friendship (tick), Reports_to (tick); FROM Al-1 TO Dave-4, Advice (tick), Friendship (tick), Reports_to (no tick); and so on.
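If you prefer to pull the sections apart programmatically rather than in Notepad, something like the following Python sketch would do it. It relies only on the convention described above, that each section starts with a line beginning with an asterisk. The sample text is hand-typed for illustration and is not a real NetDraw export; I am also assuming whitespace-separated columns when splitting the header, which may not match your file.

```python
def split_vna_sections(text):
    """Group the lines of a VNA file under their '*section' headings."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if stripped.startswith("*"):
            current = stripped[1:].strip().lower()
            sections[current] = []
        elif current is not None:
            sections[current].append(stripped)
    return sections


# Hand-typed sample (an assumption, not actual NetDraw output).
sample = """*node data
ID
Al-1
Bru-2
Dave-4
*tie data
FROM TO Advice Friendship Reports_to
Al-1 Bru-2 1 1 1
Al-1 Dave-4 1 1 0
"""

sections = split_vna_sections(sample)
print(sections["tie data"][0].split())  # the column headers
print(len(sections["tie data"]) - 1)    # number of tie records
```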

It is often useful to take text output from UCINET or NetDraw and organise it in an Excel worksheet. This allows you to manipulate whole columns and rows of data. You use the basic copy and paste (clipboard) functions of the Windows operating system. Note: Mac, iOS functions seem to operate differently.

From the Notepad file you can select all the tie data, copy it and paste it into a temporary Excel sheet, then use the Import wizard (Delimited -> comma) to split it into columns. (The column headers, FROM TO ADVICE…, are a bit tricky; leave them in or exclude them as you wish.)

Insert a row and add the column headers from the data or just clean up the copied row.

I recommend managing network data with the VNA tie data list format. It fits well with Excel’s spreadsheet procedures. I have also humanised this standard dataset by giving all the nodes a humanoid ID. The original system number is tagged behind. Thus Social Actor #1 is Al-1 [Al = Albert], Bru-2, Chas-3, Dave-4 and so on. The demo datasets have been amended this way.

The NetDraw manual describes the (tie data list) VNA format. The UCINET manual uses the graph-theoretic terminology of ‘edgelist’. Thus, what you get from NetDraw Save Data as ->vna is an edgelist1 (or Edgearray1) format in UCINET’s DL Editor.

Reading tie data into NetDraw

If you have a list of ties and no inventory of the node IDs, you can feed the list into NetDraw and it will populate the ID column of the *node data section. The VNA input file has to start with these lines:

*node data
ID
*tie data
FROM TO
[Enter your list]

When you use File ->Save data as ->VNA you get a list of all the node IDs in the node data section. You can take this out and use it as the basis of your node data list.
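If your list of ties already lives in a script or spreadsheet export, the required opening lines can be generated automatically. This Python sketch builds the text of such a file from a list of (FROM, TO) pairs. The function name build_vna is hypothetical, and the sketch assumes the node IDs contain no white spaces.

```python
def build_vna(ties):
    """Return VNA text with an empty node list and one FROM TO record per tie."""
    lines = ["*node data", "ID", "*tie data", "FROM TO"]
    lines += [f"{sender} {receiver}" for sender, receiver in ties]
    return "\n".join(lines) + "\n"


ties = [("Al-1", "Bru-2"), ("Al-1", "Dave-4"), ("Bru-2", "Al-1")]
vna_text = build_vna(ties)
print(vna_text)
```

Save the resulting text to a file with a .vna extension in your working folder and open it from NetDraw.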

NB: If you want to see the original UCINET dataset supplied with the program it is a file called Krack-Hi-Tech. (The node IDs are numbers, not names.)

UCINET’s data editors: The DL (Data Language) editor (for tie data)

UCINET data editors have improved dramatically over the last few years. They allow you to read data in from your Excel worksheets and save it immediately as UCINET datasets. Thus you can manage your data in Excel worksheets, one for tie data and one for node data, and read these directly into the data editors.

In this exercise we will take the (cleaned) raw data from Krack-Sociom AllRels and create a UCINET dataset of the tie data using the DL editor. Open UCINET and set the default folder to the Krack-Sociom All Rels folder in the Demo Data sets.

In UCINET go to Data -> Data editors. I recommend that you use the DL editor (which will read but not edit an Excel worksheet) and avoid the Excel editor (which will write to the Excel worksheet). If you want to change the Excel worksheet, do it with the Excel program.

Open the DL editor, and use File -> Open ->Excel file. You also have to set the filter to Excel (programming oversight!). Locate the Excel file Krack-Sociom AllRels.xlsx and select the tie data worksheet (OrigTieData). The DL Editor reads that worksheet. You may need to enable Col resizing to adjust the display or, alternatively, use the auto option in the Edit menu. (NB: Comments added to cells in the Excel document do not interfere with this process.)

You now need to tell UCINET how the data is arranged. A VNA tie data list is an Edgearray1 as indicated. The explanation after the label gives the row format, ego (a node ID), alter (a second node ID) and then data about any number of relations until the end of the row is reached.
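To see what this row format implies, here is a Python sketch that turns rows of the form ego, alter, relation values into one sociomatrix per relation. It only illustrates the logic and is not UCINET's own import routine; the function name is hypothetical and the two data rows are typed in by hand from the example ties described earlier.

```python
def edge_array_to_matrices(header, rows):
    """Build one square matrix per relation from (ego, alter, value, ...) rows."""
    relations = header[2:]  # everything after the FROM and TO columns

    # Collect node IDs in order of first appearance.
    nodes = []
    for ego, alter, *_ in rows:
        for node in (ego, alter):
            if node not in nodes:
                nodes.append(node)

    index = {node: k for k, node in enumerate(nodes)}
    n = len(nodes)
    matrices = {rel: [[0] * n for _ in range(n)] for rel in relations}

    # Row = ego (sender), column = alter (receiver).
    for ego, alter, *values in rows:
        for rel, value in zip(relations, values):
            matrices[rel][index[ego]][index[alter]] = value
    return nodes, matrices


header = ["FROM", "TO", "Advice", "Friendship", "Reports_to"]
rows = [
    ("Al-1", "Bru-2", 1, 1, 1),
    ("Al-1", "Dave-4", 1, 1, 0),
]

nodes, matrices = edge_array_to_matrices(header, rows)
print(nodes)
print(matrices["Reports_to"])
```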

Use File -> Save as a UCINET dataset and provide a name. I suggest Krack-Sociom AllRels.

Open the dataset with NetDraw to see the diagram and use Display to see the sociomatrices. (NB: The DL editor is able to handle white spaces left in the column headers; I am not sure about node IDs.)

Connecting node data to a UCINET (tie data) dataset

NetDraw uses node data to set the display properties of nodes. Thus you can colour the nodes by an attribute (blue for F and pink for M), or set the shape of the node or its size.

NetDraw stores information about nodes in its Node attribute editor. You access this through Transform ->Node attribute editor. NetDraw, but not UCINET, will handle character data. To feed this into NetDraw I recommend you create a *node data file. I have done this for the Krack-Sociom datasets and you see it in the higher level folder. It is a simple Notepad (*.txt) file with just one command line (*node data) and the initial column header of ID (or id).

When you have your tie data read into NetDraw you open this file as a VNA attribute file and NetDraw will tell you if the list of IDs matches those in the tie data. You can then access this information in the dropdown list in the Nodes tab of NetDraw or in the node attribute editor.
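You can run a similar check yourself before loading the files. The Python sketch below compares the IDs that appear in a tie list with those in a node data list. It is my own pre-check with hypothetical names, not NetDraw's matching routine, and the example IDs are invented.

```python
def compare_ids(tie_rows, attribute_ids):
    """Return (IDs in ties but not in node data, IDs in node data but not in ties)."""
    tie_ids = {node for sender, receiver in tie_rows for node in (sender, receiver)}
    attr_ids = set(attribute_ids)
    return sorted(tie_ids - attr_ids), sorted(attr_ids - tie_ids)


tie_rows = [("Al-1", "Bru-2"), ("Bru-2", "Chas-3")]
attribute_ids = ["Al-1", "Bru-2", "Dave-4"]

missing_from_node_data, unused_in_ties = compare_ids(tie_rows, attribute_ids)
print(missing_from_node_data)  # ['Chas-3']
print(unused_in_ties)          # ['Dave-4']
```

Two empty lists mean the IDs agree exactly. Remember that the comparison is character for character, so a stray space or a change of case counts as a mismatch.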

Once you have the node data added you can experiment with colours and shapes in the network diagram.
