Data Visualization Techniques
Bar Charts
Scatter Plots
Pie Charts
Visualization Velocity
Decision Trees
Mobile
Conclusion
the best possible visual based on the data that is selected. The visualizations make it easy to see patterns and trends
Bar Charts
Bar charts are most commonly used for comparing the quantities of different categories or groups. Values of a category are represented using the bars, and they can be configured with either vertical or horizontal bars, with the length or height of each bar representing the value.

When values are distinct enough that differences in the bars can be detected by the human eye, you can use a simple bar chart. However, when the values (bars) are very close together or there are large numbers of values (bars) that need to be displayed, it becomes more difficult to compare the bars to each other.

To help provide visual variance, bars can have different colors. The colors can be used to indicate such things as a particular status or range. Coloring the bars works best when most bars are in a different range or status. When all bars are in the same range or status, the color becomes irrelevant, and it is most visually helpful to keep the color consistent or have no coloring at all.

Another form of a bar chart is called the progressive bar chart, or waterfall chart. A waterfall chart shows how the initial value of a measure increases or decreases during a series of operations or transactions (see Figure 3). The first bar begins at the initial value, and each subsequent bar begins where the previous bar ends. The length and direction of a bar indicate the magnitude and type (positive or negative, for example) of the operation or transaction. The resulting chart is a stepped cascade that shows how the transactions or operations lead to the final value of the measure.
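The waterfall construction (the first bar starts at the initial value, and each later bar starts where the previous one ends) can be sketched in a few lines of Python. This is a minimal illustration, not how any particular charting product computes it; the function name `waterfall_segments` and the sample numbers are assumptions made up for the example.

```python
def waterfall_segments(initial, changes):
    """Return one (start, end) pair per operation in a waterfall chart.

    The first bar starts at the initial value; every later bar starts
    where the previous bar ended.
    """
    segments = []
    running = initial
    for delta in changes:
        segments.append((running, running + delta))
        running += delta
    return segments


# Hypothetical data: an opening balance followed by three transactions.
steps = waterfall_segments(100, [30, -50, 20])
for start, end in steps:
    direction = "increase" if end >= start else "decrease"
    print(f"{start:>4} -> {end:>4}  {direction} of {abs(end - start)}")

final_value = steps[-1][1]  # the last bar ends at the final value of the measure
```

The length of each bar is the absolute difference between its two ends, and its direction follows from the sign of the change.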
Figure 3: This bar graph, a waterfall chart, is used to represent the relative contribution of each category to the total.
Scatter Plots
A scatter plot (or X-Y plot) is a two-dimensional plot that shows the joint variation of two data items. In a scatter plot, each marker (symbols such as dots, squares and plus signs) represents an observation. The marker position indicates the value for each observation. Scatter plots also support grouping. When you assign more than two measures, a scatter plot matrix is produced. A scatter plot matrix is a series of scatter plots that displays every possible pairing of the measures that are assigned to the visualization.

In a scatter plot, you can also apply statistical analysis with correlation and regression. Correlation identifies the degree of statistical correlation between the variables in the plot. Regression plots a model of the relationship between the variables in the plot.

Once you have plotted all of the data points using a scatter plot, you are able to visually determine whether data points are related. Scatter plots can help you gain a sense of how spread out the data might be or how closely related the data points are, as well as quickly identify patterns present in the distribution of the data (see Figure 4). Scatter plots are helpful when you have many data points. If you are working with a small set of data points, a bar chart or table may be a more effective way to display the information.

Bubble Plots: A Scatter Plot Variation
A bubble plot is a variation of a scatter plot in which the markers are replaced with bubbles. In a bubble plot (see Figure 5), each bubble represents an observation. The location of the bubble represents the value for two measured axes; the size of the bubble represents the value for a third measure. These plots are useful for data sets with dozens to hundreds of values or when the values differ by several orders of magnitude. You can also use a bubble plot when you want specific values to be represented by different bubble sizes. Animated bubble plots are a good way to display changing data over time.
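To make the correlation and regression analysis mentioned for scatter plots concrete, here is a small pure-Python sketch. It assumes Pearson correlation and an ordinary least-squares straight line, which are common choices but are not stated in the text; the function names and the sample data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x (assumed regression model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return mean_y - slope * mean_x, slope


# Hypothetical observations: advertising spend (x) and units sold (y).
spend = [1, 2, 3, 4, 5, 6]
units = [3, 5, 6, 9, 10, 13]

r = pearson_r(spend, units)
intercept, slope = fit_line(spend, units)
print(f"correlation: {r:.3f}")
print(f"fitted line: y = {intercept:.2f} + {slope:.2f} * x")
```

A coefficient near 1 or -1 suggests the markers fall close to a straight line; a value near 0 suggests no linear relationship.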
Figure 4: A scatter plot is a good way to visualize relationships in data.
next to each other. If you do use pie charts, they are most effective when there are limited components and when text and percentages are included to describe the content. When you provide this additional information, information consumers do not have to guess the meaning and value of each slice. If you choose to use a pie chart, the slices should be a percentage of the whole (see Figure 6). When designing reports or
Figure 6: A pie chart helps you compare the percentages of different components.
Figure 7: Alternatives to pie charts include line charts and bar charts.
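The advice that each pie slice should be a percentage of the whole, labeled with text and a percentage, amounts to a simple calculation. The sketch below is illustrative only; `slice_percentages` and the regional figures are invented for the example.

```python
def slice_percentages(parts):
    """Convert raw component values into percentages of the whole."""
    total = sum(parts.values())
    return {label: round(100 * value / total, 1) for label, value in parts.items()}


# Hypothetical sales by region.
sales = {"North": 50, "South": 30, "West": 20}

for label, pct in slice_percentages(sales).items():
    # Each slice gets a text label and its percentage, so readers do not have to guess.
    print(f"{label}: {pct}%")
```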
Of course, there are many other chart types you can use to present data and analytical results. The selection of charts usually will depend upon the number of categories and measures (or dimensions) you want to visualize. Even after following the tips outlined here and understanding the examples, you may need to try different types of visuals and test them with your audience to make sure the correct information is being conveyed.

Visualizing Big Data
Big data brings new challenges to visualization because of the speed, size and diversity of data that must be taken into account. The cardinality of the columns you are trying to visualize should also be considered. One of the most common definitions of big data is data that is of such volume, variety and velocity that an organization must move beyond its comfort zone technologically to derive intelligence for effective decisions.

• Volume refers to the size of the data.
• Variety describes whether the data is structured, semistructured or unstructured.
• Velocity is the speed at which data pours in and how frequently it changes.

Building upon basic graphing and visualization techniques, SAS Visual Analytics has taken an innovative approach to addressing the challenges associated with visualizing data. Using innovative, in-memory capabilities combined with SAS Analytics and data discovery, SAS provides new techniques based on core fundamentals of data analysis and the presentation of results.
Large Data Volumes
One challenge when working with big data is how to display results of data exploration and analysis in a way that is meaningful and not overwhelming. You may need a new way to look at the data that collapses and condenses the results in an intuitive fashion but still displays graphs and charts that decision makers are accustomed to seeing. You may also need to make the results available quickly via mobile devices, and provide users with the ability to easily explore data on their own in real time.

data you wish to examine and then, based on the amount of data and the type of data, it presents the most appropriate visualization.
Figure 9: This box plot compares the distribution of data points within a category.
Different Varieties of Data (Semistructured and Unstructured)
Data variety brings challenges because semistructured and unstructured data require new visualization techniques. A word cloud visual (where the size of the word represents its frequency within a body of text) can be used on unstructured data as a way to display high- or low-frequency words (see Figure 10).

SAS Visual Analytics takes the concept of word clouds a step further. It takes advantage of taxonomies and ontologies to make associations and then organizes words into topics based on how the words are being used. SAS Visual Analytics word clouds display the hot topics of the day gleaned from this text analysis. End users can drill down by clicking on the individual topic to see exactly what words or phrases comprise a particular topic.

Another visualization technique that can be used for semistructured or unstructured data is the network diagram. Network diagrams view relationships in terms of nodes (representing individual actors within the network) and ties (which represent relationships between the individuals, such as friendship, kinship, organizations, business relationships, etc.). These networks are often depicted in a diagram where nodes are represented as points and ties are represented as lines.

Network diagrams can be used in many applications and disciplines. For example, businesses analyze social networks to understand their interactions with customers, while counterintelligence and law enforcement might map a clandestine or covert organization such as an espionage ring, an organized crime family or a street gang. You can also superimpose the network diagram on a map, for example, to show the relationship or product sales across geographic areas (see Figure 11). Word clouds and network diagrams are currently available in solutions such as SAS Text Miner and SAS Social Media Analytics.
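The node-and-tie structure behind a network diagram can be represented as a simple adjacency mapping. The following sketch is one possible representation under the assumption that ties are undirected; the names and relationships are invented, and it does not reflect how SAS Text Miner or SAS Social Media Analytics store networks.

```python
from collections import defaultdict


def build_network(ties):
    """Map each node to the set of nodes it is tied to (undirected ties assumed)."""
    neighbors = defaultdict(set)
    for a, b in ties:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return dict(neighbors)


# Hypothetical ties between individual actors.
ties = [("Ana", "Ben"), ("Ana", "Cy"), ("Ben", "Cy"), ("Cy", "Dee")]

network = build_network(ties)
degree = {node: len(linked) for node, linked in network.items()}
most_connected = max(degree, key=degree.get)

for node in sorted(network):
    print(f"{node}: tied to {sorted(network[node])}")
print(f"most connected node: {most_connected}")
```

In a drawn diagram, each key would become a point and each tie a line; the tie count per node is one simple way to spot central actors.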
Figure 10: A word cloud shows the words or phrases associated with a topic.

For example, you could use the topic cloud to categorize customer comments on Twitter about your products or services and then click on a topic to drill down to see the actual comments.

While visualizing structured data is fairly simple, semistructured or unstructured data requires new visualization techniques, such as

Visualization Velocity
Velocity is all about the speed at which data is coming into the organization. The ability to access and process varying velocities of data quickly is critical. A correlation matrix combines big data and fast response times to quickly identify which variables are related. It also shows how strong the relationships are between variables. SAS Visual Analytics makes it easy to assess the relationships. Simply select a group of variables and drop them into a visualization pane. The intelligent autocharting function displays a color-coded correlation matrix that quickly identifies strong and weak relationships between the variables. Darker boxes indicate a stronger correlation; lighter boxes indicate a weaker correlation. If you hover over a box, a summary of the relationship is shown. You can double-click on a box in the matrix for further details.

Figure 12 displays 45 correlation calculations on slightly more than 1.1 billion rows of data. This graph shows the correlation values, and returns results in two to six seconds using the SAS LASR Analytic Server. Previously, this type of calculation would take hours. Now it can be done in seconds. By using box plots and correlation matrices, SAS Visual Analytics can help speed up your analytics life cycle because analytical modelers can perform variable reductions more quickly and efficiently.
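A correlation matrix is a set of pairwise correlations, one per pair of variables. As a point of arithmetic, 10 variables yield 10 × 9 / 2 = 45 distinct pairs, which would match the 45 calculations mentioned for Figure 12, although the text does not state how many variables were used. The sketch below assumes Pearson correlation, and the shading thresholds (0.7 and 0.3) are arbitrary values chosen for illustration; they are not SAS Visual Analytics settings.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_pairs(columns):
    """Correlation for every distinct pair of named columns."""
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


def shade(r):
    """Darker shade for stronger correlation (thresholds are assumptions)."""
    strength = abs(r)
    if strength >= 0.7:
        return "dark"
    if strength >= 0.3:
        return "medium"
    return "light"


# Hypothetical columns of a small data set.
columns = {
    "price": [1, 2, 3, 4, 5],
    "sales": [10, 8, 6, 4, 2],
    "visits": [3, 1, 4, 1, 5],
}

pairs = correlation_pairs(columns)
for (a, b), r in pairs.items():
    print(f"{a} vs {b}: r = {r:+.2f} ({shade(r)})")
```

Strength is judged on the absolute value, so a strong negative relationship is shaded as dark as a strong positive one.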
Figure 12: In this correlation matrix, darker boxes indicate a stronger correlation; lighter boxes indicate a weaker
correlation. You can double-click on a box for further details.
Cardinality becomes a concern in big data because the data may have many unique values per column. The example in Figure 13 shows only 128 unique cities. Because you cannot see the labels for each bar, the graph becomes less meaningful. Imagine if you had a million bars! It would be impossible to see them.

SAS has adopted a method for dealing with high cardinality in SAS Visual Analytics: bar charts that provide an overview bar that zooms into the bar chart and enable information consumers to scroll through the entire chart. The level of zoom can also be controlled. If you compare Figure 13 to Figure 14, it is easy to see that Figure 14 presents the information more clearly.

Figure 14: This overview axis bar chart shows the high cardinality in this big data more clearly. You can easily scroll through the entire chart.

But what if the filter isn't meaningful or it skews the data in undesirable ways? One way to better understand the composition of your data is through the use of histograms. Histograms provide a visual distribution of the data along with cues for how the data will change if you filter on a particular measure. Histograms save time by giving you an idea of the effect the filter will have on the data before you apply it. Rather than relying on trial and error or instinct, you can use the histogram to help you decide what to focus on.

Data Visualization Made Easy With Autocharting
In SAS Visual Analytics, intelligent autocharting produces the best visual based on what data you drag and drop onto the visual palette. It is important to note that autocharting may not always create the exact visualization you had in mind. In that case, you also can select a specific visual to build. However, when you are first exploring a new data set, autocharts are useful because they provide a quick view of the data. You then have the ability to switch to another specific visual as desired. For example, with autocharting, when a single measure is selected, distribution of that measure is shown (Figure 15).

also required and the visual will be a line graph (see Figures 1 and 2). If the category is geographic, then a map will be displayed (see Figure 18).
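The histogram discussion in this section (seeing the distribution of a measure and estimating what a filter would do before applying it) can be illustrated with a short sketch. Fixed-width bins and a simple "keep values at or above a minimum" filter are assumptions made for the example, as are the function names and the order totals.

```python
def histogram(values, bin_width):
    """Count values per fixed-width bin, keyed by the lower edge of each bin."""
    counts = {}
    for value in values:
        bin_start = (value // bin_width) * bin_width
        counts[bin_start] = counts.get(bin_start, 0) + 1
    return dict(sorted(counts.items()))


def share_kept(values, minimum):
    """Fraction of rows that would survive a 'value >= minimum' filter."""
    kept = sum(1 for value in values if value >= minimum)
    return kept / len(values)


# Hypothetical order totals.
order_totals = [5, 12, 18, 22, 25, 31, 47, 52, 58, 95]
bin_width = 20

bins = histogram(order_totals, bin_width)
for start, count in bins.items():
    print(f"{start:>3} to {start + bin_width - 1:<3} {'#' * count}")

# Preview the effect of a filter before applying it.
print(f"rows kept if filtering to totals >= 40: {share_kept(order_totals, 40):.0%}")
```

Reading the bin counts first shows roughly how much data a given cutoff would discard, which is the trial-and-error step the histogram is meant to replace.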
Can You See Into the Future?
Forecasting estimates future values for your data based on statistical trends. As such, it is an extremely important tool for organizational planning. Fortunately, SAS Visual Analytics can help you expand the culture of forecasting in your organization. Easy-to-use capabilities take the complexity out of forecasting, so that users of all skill levels can see for themselves what might happen in the future.

A simple menu guides users through the process of generating forecasting results. Select the date, time or date-time data items you want to use for the forecast. The software automatically chooses the most appropriate forecasting algorithm for the data chosen. You also have the option to select the forecasting intervals. When you click OK, a line chart is created, along with a clear explanation of the forecasting results in the "What does it mean?" section at the bottom of the screen, as shown in Figure 22. This is just another way SAS Visual Analytics brings advanced analytics to nontechnical users in an approachable format.

When additional measures are added to the forecast (as shown in Figure 23), three things happen in SAS Visual Analytics:

1. Each variable is evaluated to determine whether it influences the forecast. Variables deemed to be influencers are added to the bottom of the screen for simulation purposes.
2. When influencers are found, the forecast is recalculated and refined. As you can see, the confidence interval (light blue bars) around the forecast in Figure 23 is much tighter than in Figure 22.
3. Users can manipulate the values of the influencing variables to see the potential impact on the forecast, in effect performing simulations.
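To give a feel for what a forecast with a band around it involves, here is a deliberately naive sketch: a straight-line trend fitted by least squares, extended into the future, with a rough band of about 1.96 residual standard deviations on each side. This is an assumption-laden illustration, not the algorithm SAS Visual Analytics selects, and the band is cruder than a proper statistical confidence or prediction interval. The function name and the monthly figures are hypothetical.

```python
from math import sqrt


def linear_forecast(history, steps, z=1.96):
    """Extend a least-squares trend line and attach a rough band.

    Returns a list of (low, point, high) tuples, one per future step.
    """
    n = len(history)
    if n < 3:
        raise ValueError("need at least three observations")
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history)) / var_x
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, history)]
    spread = sqrt(sum(r * r for r in residuals) / (n - 2))
    forecast = []
    for step in range(n, n + steps):
        point = intercept + slope * step
        forecast.append((point - z * spread, point, point + z * spread))
    return forecast


# Hypothetical monthly sales figures.
monthly_sales = [100, 104, 109, 111, 118, 121]

for low, point, high in linear_forecast(monthly_sales, steps=3):
    print(f"forecast {point:7.1f}  (band {low:7.1f} to {high:7.1f})")
```

The less the history scatters around the fitted trend, the narrower the band becomes, which is the same intuition behind a tighter interval once useful influencing variables explain more of the variation.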
Figure 22: With automated forecasting capabilities, SAS Visual Analytics chooses the most appropriate forecasting algorithm for the selected data. "What does it mean?" (bottom) provides explanations of analytic functions and data correlations, so even nontechnical users can understand what the data means.
Figure 23: When additional measures are added, these underlying factors are evaluated for their potential impact on the forecast, the forecast is recalculated accordingly, and users can use these additional values to perform simulations.
Mobile
Growing employee adoption of mobile devices means that businesses need to deliver company information to these devices at any time and from anywhere. SAS Visual Analytics comes with SAS Mobile BI to allow businesses to give front-line and mobile employees access to business intelligence. With SAS Mobile BI, employees can look at a vast array of different types of company business intelligence reports, KPIs and dashboards on their mobile devices. Rather than having to wait until they get back to the office, mobile users can quickly and easily gain a deeper analytical understanding of business performance.

Conclusion
Visualizing your data can be both fun and challenging. It is much easier to understand information in a visual compared to a large table with lots of rows and columns. However, with the many visually exciting choices available, it is possible that the visual creator may end up presenting the information using the wrong visualization. In some cases, there are specific visuals you should use for certain data. In other instances, your audience may dictate which visualization you present. In the latter scenario, showing your audience an alternative visual that conveys the data more clearly may provide just the information that's needed to truly understand the data.

You can choose the most appropriate visualization by understanding the data and its composition, what information you are trying to convey visually to your audience, and how viewers process visual information.

And products such as SAS Visual Analytics can help provide the best, fastest visualizations possible. The solution enables you to explore all of your data using visual techniques combined with industry-leading analytics. Visualizations such as box plots and correlation matrices help you quickly understand the composition and relationships in your data.

The net effect is the ability to accelerate the analytics life cycle and to perform the process more often, with more data. Users can quickly view more options, ask more questions, make more precise decisions and succeed faster than ever before.

Analytics, download white papers, view screenshots and see other related material, please visit sas.com/visualanalytics. Or, try SAS Visual Analytics for yourself at sas.com/vademos.
To contact your local SAS office, please visit: sas.com/offices
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies. Copyright © 2014, SAS Institute Inc. All rights reserved.
106006_S120359.0514