Elastic Stack Overview

Uploaded by

srikanth
Elastic Stack Overview

The world’s most popular enterprise open source products for real-time search, logging, analytics, and more
Agenda
• Elastic Stack Overview
• Architecture
• Demos: Logging, Search
• Logstash & Beats
• Elasticsearch
✦ The Distributed Model
✦ Text Analysis
✦ Search
✦ Aggregation
• Kibana
Once upon a time …
• As any good story begins, “Once upon a time...”
✦ More precisely: in 1999, Doug Cutting created an open-source project called Lucene
• Lucene is:
✦ a search engine library entirely written in Java
✦ a top-level Apache project, as of 2005
✦ great for full-text search
• But, Lucene is also:
✦ a library (you have to incorporate it into your application)
✦ challenging to use
✦ not originally designed for scaling
The Birth of Elasticsearch
• In 2004, Shay Banon developed a product called Compass
✦ Built on top of Lucene; Shay’s goal was to integrate search into Java applications as simply as possible
• The need for scalability became a top priority
• In 2010, Shay completely rewrote Compass with two main objectives:
1. distributed from the ground up in its design
2. easily used by any programming language
• He called it Elasticsearch ... and we all lived happily ever after!
• Today, Elasticsearch is the most popular enterprise search engine
• 85,000+ Community Members
• 100M+ Product Downloads
• 3,000+ Subscription Customers

Statistics since 2012, founding of Elastic
Who is using Elasticsearch?
• Tech
• Finance
• Telco
• Consumer

Enterprise Customers in Every Industry
Solving Problems Beyond ‘Search’
• “Improving patient care with real-time clinical decision making.”
• “Combating our global human trafficking problem.”
• “Mining 3-4 billion events per day to ensure security intelligence.”
• “Many use cases from trade optimization to compliance to HR recruiting.”
X-Pack
Extensions for the Elastic Stack
• Single install
• Subscription pricing
• Features: Security, Alerting, Monitoring, Reporting, Graph, Machine Learning
Elastic Cloud
Hosted Elasticsearch & Kibana
• Includes X-Pack features
• Available in AWS today
• Available in Google Cloud Platform (Beta)
• Available as a private cloud/on-premise solution (Elastic Cloud Enterprise)
Enterprise Deployment Architecture
[Architecture diagram. Components shown:]
• Beats: log files, metrics, wire data, your{beat}
• Logstash: datastore, web APIs, sensors, social
• Messaging queue: Kafka, Redis
• Elasticsearch: Master Nodes (3), Ingest Nodes (X), Data Nodes – Hot (X), Data Nodes – Warm (X)
• Kibana and Custom UI
• X-Pack: authentication (LDAP, AD, SSO), notification
• ES-Hadoop: Hadoop ecosystem
• Other tiers are labeled Instances (X) or Nodes (X)
Solving many diverse & complex use cases
• Products: Elastic Stack, X-Pack, Elastic Cloud
• Use cases: Application Search, Log Analytics, Security Analytics, Metrics Analytics, Business Analytics, many more …

Demo: Apache Logging
Logstash
Data processing pipeline
• Ingest data of all shapes, sizes, and sources
• Parse and dynamically transform data
• Transport data to any output
• Secure and encrypt data inputs
• Build your own pipeline
• More than 200 plugins
Parsing Logs Using Logstash
Logstash Configuration Example – Apache Access Logs
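input {
  # Read the Apache access log from the start of the file
  file {
    path => "/Users/aquan/Desktop/JUG/demo/access_log"
    start_position => "beginning"
  }
}
filter {
  if [path] =~ "access" {
    mutate { replace => { "type" => "apache_access" } }
    # Parse the combined log format into named fields
    grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
    # Look up the geographic location of the client IP
    geoip { source => "clientip" }
  }
  # Use the request time from the log line as the event timestamp
  date { match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ] }
}
output { elasticsearch { hosts => ["localhost:9200"] } }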
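
To make the grok and date steps concrete, here is a minimal Python sketch of roughly what they extract from one access-log line. The regular expression is a simplified stand-in written for this example, not the real COMBINEDAPACHELOG definition, and the sample log line is made up.

```python
import re
from datetime import datetime

# Simplified stand-in for grok's COMBINEDAPACHELOG pattern (an assumption,
# not the real grok definition). Group names follow the field names used
# in the Logstash config, such as clientip and timestamp.
COMBINED_LOG = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = COMBINED_LOG.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Same job as the date filter: the pattern "dd/MMM/yyyy:HH:mm:ss Z"
    # corresponds to this strptime format.
    event["@timestamp"] = datetime.strptime(
        event["timestamp"], "%d/%b/%Y:%H:%M:%S %z"
    )
    return event


# Made-up sample line in Apache combined log format.
SAMPLE = (
    '203.0.113.7 - - [10/Oct/2017:13:55:36 -0700] '
    '"GET /index.html HTTP/1.1" 200 2326 '
    '"http://example.com/start" "Mozilla/5.0"'
)

event = parse_access_line(SAMPLE)
print(event["clientip"], event["verb"], event["response"], event["@timestamp"])
```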
Logstash Configuration Example - Spring Boot Logs
filter {
  # If log line contains tab character followed by 'at' then we will tag that entry as stacktrace
  if [message] =~ "\tat" {
    grok {
      match => ["message", "^(\tat)"]
      add_tag => ["stacktrace"]
    }
  }

  # Grokking Spring Boot's default log format
  grok {
    match => [ "message",
      "(?<timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) %{LOGLEVEL:level} %{NUMBER:pid} --- \[(?<thread>[A-Za-z0-9-]+)\] [A-Za-z0-9.]*\.(?<class>[A-Za-z0-9#_]+)\s*:\s+(?<logmessage>.*)"
    ]
  }

  # Parsing out timestamps which are in timestamp field
  date { match => [ "timestamp" , "yyyy-MM-dd HH:mm:ss.SSS" ] }
}
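
The same extraction can be tried outside Logstash. The sketch below is a rough Python translation of the grok expression, under a few assumptions: the grok macros are replaced by hand-written character classes, whitespace between fields is matched loosely with `\s+`, the `class` group is renamed `class_name`, and both sample lines are invented. Unlike the Logstash filter, it stops after tagging a stack-trace line instead of also running the second pattern.

```python
import re
from datetime import datetime

# Hand-written approximation of the grok pattern for Spring Boot's default
# log format (an assumption, not generated from the grok macros).
SPRING_LINE = re.compile(
    r'(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+'
    r'(?P<level>[A-Z]+)\s+(?P<pid>\d+) --- '
    r'\[(?P<thread>[A-Za-z0-9-]+)\] '
    r'[A-Za-z0-9.]*\.(?P<class_name>[A-Za-z0-9#_]+)\s*:\s+'
    r'(?P<logmessage>.*)'
)


def parse_spring_line(line):
    """Tag stack-trace lines, otherwise split the line into named fields."""
    if "\tat" in line:
        # Mirrors the first filter: a tab followed by 'at' marks a stack frame.
        return {"message": line, "tags": ["stacktrace"]}
    match = SPRING_LINE.match(line)
    if match is None:
        return {"message": line, "tags": ["unparsed"]}
    event = match.groupdict()
    # Mirrors the date filter: "yyyy-MM-dd HH:mm:ss.SSS".
    event["@timestamp"] = datetime.strptime(
        event["timestamp"], "%Y-%m-%d %H:%M:%S.%f"
    )
    return event


# Invented sample lines for illustration.
LOG_LINE = (
    "2017-06-01 10:15:30.123  INFO 12345 --- [nio-8080-exec-1] "
    "c.e.demo.GreetingController              : Handling request"
)
STACK_LINE = "\tat com.example.demo.GreetingController.greet(GreetingController.java:42)"

parsed = parse_spring_line(LOG_LINE)
print(parsed["level"], parsed["class_name"], parsed["logmessage"])
print(parse_spring_line(STACK_LINE)["tags"])
```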
Beats
Lightweight data shippers
• Ship data from the source
• Ship and centralize in Elasticsearch
• Ship to Logstash for transformation and parsing
• Ship to Elastic Cloud
• Libbeat: API framework to build custom Beats
• 30+ community Beats

• Filebeat – log files
• Metricbeat – metrics
• Packetbeat – network data
• Winlogbeat – Windows events

More than 30 community Beats and growing: apachebeat, dockbeat, httpbeat, mysqlbeat, nginxbeat, redisbeat, twitterbeat, and more
Elasticsearch
Heart of the Elastic Stack
• Distributed, Scalable
• High-availability
• Multi-tenancy
• Developer Friendly
• Real-time, Full-text Search
• Aggregations

Clusters, Nodes and Indices
[Diagram: cluster my_cluster with one server (Server 1) running Node A, which holds two indices, twitter and logs, each containing documents (d1, d2, …).]
Split Indices into Shards
[Diagram: the same cluster my_cluster (Server 1, Node A); the documents of the twitter and logs indices are divided among shards.]
Distribute Shards over Multiple Nodes
[Diagram: cluster my_cluster spans two servers, Server 1 (Node A) and Server 2 (Node B); twitter shards 0–4 and logs shards 0–1 are spread across the two nodes.]
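
How a document ends up on a particular shard can be sketched in a few lines of Python. In its simplest form, Elasticsearch picks the shard as a hash of the routing value (the document id by default) modulo the number of primary shards. The sketch below assumes `zlib.crc32` as a stand-in for the Murmur3 hash Elasticsearch uses, and a toy round-robin placement of shards on nodes; the function names are hypothetical.

```python
import zlib


def shard_for(routing_key, number_of_primary_shards):
    """Pick the primary shard for a document from its routing key."""
    # Stand-in hash: crc32 is used only because it ships with Python and
    # is stable across runs. Elasticsearch itself uses Murmur3.
    return zlib.crc32(routing_key.encode("utf-8")) % number_of_primary_shards


def allocate(shard_count, nodes):
    """Toy allocation: spread shard numbers over nodes round-robin."""
    return {shard: nodes[shard % len(nodes)] for shard in range(shard_count)}


TWITTER_SHARDS = 5  # the diagram shows twitter shards 0-4
allocation = allocate(TWITTER_SHARDS, ["Node A", "Node B"])

for doc_id in ["d1", "d2", "d3", "d4", "d5"]:
    shard = shard_for(doc_id, TWITTER_SHARDS)
    print(doc_id, "-> twitter shard", shard, "on", allocation[shard])
```

Because the shard number depends on the number of primary shards, that number has to stay fixed once documents have been routed this way.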


CRUD
Text Analysis
Inverted Index
Most think of search as…
SEARCH
• Multilingual
• Full Text Search
• Stemming
• Type ahead
• Mobile
• Time Range
• Geo search
• Influenced by Rating
• Personalized Ranking
Search
• Pagination
• Time range Filter
• Numeric Filter
• Geo range Filter
• Stemming / Highlighting
Demo: e-Commerce Search
Search – Finding the Needles in the Haystack
• Relevancy – scoring of a document based on how closely it matches the query
✦ TF (term frequency): the more often a term appears in a field, the more important it is
✦ IDF (inverse document frequency): the more documents that contain the term, the less important the term is
✦ Field length: shorter fields are more likely to be relevant than longer fields
• Structured Search
• Full-Text Search
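
The three relevancy factors can be illustrated with a small Python sketch. It assumes the classic Lucene TF/IDF formulas (square-root term frequency, logarithmic inverse document frequency, inverse square-root field length) and multiplies them into a simplified single-term score. The real scoring function has more factors, and Elasticsearch 5.0 and later use BM25 by default, which refines the same ideas. The tiny corpus is made up.

```python
import math


def tf(freq):
    """Term frequency: more occurrences in the field, higher weight."""
    return math.sqrt(freq)


def idf(doc_count, doc_freq):
    """Inverse document frequency: common terms get a lower weight."""
    return 1.0 + math.log(doc_count / (doc_freq + 1))


def field_norm(num_terms):
    """Field-length norm: shorter fields get a higher weight."""
    return 1.0 / math.sqrt(num_terms)


def score(term, field_tokens, corpus):
    """Simplified single-term score: tf * idf * norm (an illustration only)."""
    freq = field_tokens.count(term)
    if freq == 0:
        return 0.0
    doc_freq = sum(1 for doc in corpus if term in doc)
    return tf(freq) * idf(len(corpus), doc_freq) * field_norm(len(field_tokens))


# Made-up corpus: each document is one already-tokenized field.
corpus = [
    ["quick", "brown", "fox"],
    ["quick", "quick", "dog"],
    ["lazy", "dog", "sleeps", "all", "day"],
]

for doc in corpus:
    print(doc, round(score("quick", doc, corpus), 3))
```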
Structured Search
• Answer is always “Yes” or “No”
• Does not worry about document relevance or scoring
• Filters – very fast, easily cached, no relevance scoring; use them as often as you can
✦ Term Filter, Terms Filter – numbers, Booleans, dates, and text
✦ Bool Filter (compound filter) – must, must_not, should
✦ Range Filter – number, date (date math), string
✦ Exists Filter
✦ Missing Filter
• Filter Order – Important for performance
✦ More specific filters should be placed before less-specific filters
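
As a hedged illustration of these filters, the Python sketch below builds a request body that combines term, range, and exists clauses in the filter context of a bool query. The field names (`category`, `price`, `sku`, `discontinued`) and the helper function are hypothetical; only the clause names come from the Elasticsearch query DSL.

```python
import json


def structured_query(category, min_price, max_price):
    """Build a yes/no query body: every clause filters, none affects scoring."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Most specific filter first, following the advice above.
                    {"term": {"category": category}},
                    {"range": {"price": {"gte": min_price, "lte": max_price}}},
                    {"exists": {"field": "sku"}},
                ],
                "must_not": [
                    {"term": {"discontinued": True}},
                ],
            }
        }
    }


body = structured_query("whisky", 20, 80)
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a `_search` request; no score is computed for clauses in the `filter` or `must_not` sections.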
Full-Text Search
• Relevance
✦ The match Query
✦ Multiword Queries – Precision control
๏ Operator: and, or
๏ minimum_should_match
✦ Bool Query - Combining Queries
✦ Boosting Query – boost parameter
• Multi-field Search
✦ The multi_match Query
✦ Types: best_fields, most_fields, cross_fields
✦ Boosting Individual Fields - ^
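
The query types listed above can be written out as request bodies. The following Python sketch builds a match query with precision control, a multi_match query with a per-field boost using `^`, and a bool query that combines them. The helper functions and the field names `title` and `body` are assumptions made for the example.

```python
import json


def match_query(field, text, operator="or", minimum_should_match=None):
    """match query; operator and minimum_should_match control precision."""
    params = {"query": text, "operator": operator}
    if minimum_should_match is not None:
        params["minimum_should_match"] = minimum_should_match
    return {"match": {field: params}}


def multi_match_query(text, field_boosts, match_type="best_fields"):
    """multi_match query; a boost other than 1 is written as field^boost."""
    fields = [
        name if boost == 1 else "%s^%s" % (name, boost)
        for name, boost in field_boosts.items()
    ]
    return {"multi_match": {"query": text, "type": match_type, "fields": fields}}


def bool_query(must=None, should=None, must_not=None):
    """bool query that keeps only the clause lists that were supplied."""
    clauses = {"must": must, "should": should, "must_not": must_not}
    return {"bool": {name: value for name, value in clauses.items() if value}}


body = {
    "query": bool_query(
        must=[match_query("title", "quick brown fox", minimum_should_match="75%")],
        should=[multi_match_query("quick brown fox", {"title": 2, "body": 1})],
    )
}
print(json.dumps(body, indent=2))
```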
Proximity Matching – Phrase Matching
• Search for “sue alligator”
✦ Sue ate the alligator
✦ The alligator ate Sue
✦ Sue never goes anywhere without her alligator-skin purse
• The match_phrase Query
✦ Find words that are near each other – “quick fox”
✦ Closer is better
✦ Flexibility - slop
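
What slop means can be shown with a small Python sketch that works on token positions. It is an approximation of how a sloppy phrase match is measured, not Lucene's implementation: for each query term it takes a position in the document, subtracts the term's position in the query, and reports the smallest spread between those shifted positions. The function names are hypothetical, and the tokenizer is a plain regular expression.

```python
import re
from itertools import product


def tokenize(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def min_slop(query, document):
    """Smallest slop at which the phrase would match, or None if a term is missing."""
    query_terms = tokenize(query)
    doc_tokens = tokenize(document)
    positions = []
    for term in query_terms:
        hits = [i for i, token in enumerate(doc_tokens) if token == term]
        if not hits:
            return None
        positions.append(hits)
    best = None
    # Try every combination of positions; fine for short example sentences.
    for combo in product(*positions):
        shifts = [pos - idx for idx, pos in enumerate(combo)]
        needed = max(shifts) - min(shifts)
        if best is None or needed < best:
            best = needed
    return best


def matches_phrase(query, document, slop=0):
    needed = min_slop(query, document)
    return needed is not None and needed <= slop


sentences = [
    "Sue ate the alligator",
    "The alligator ate Sue",
    "Sue never goes anywhere without her alligator-skin purse",
]
for sentence in sentences:
    print(min_slop("sue alligator", sentence), sentence)
```

A lower number means the words are closer to the exact phrase, which is the "closer is better" idea: a query with a large slop matches all three sentences, but the first one is the nearest match.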
Partial Matching
• The prefix Query
• Wildcard and regexp Queries
• Completion Suggester
✦ Query-Time Search-as-You-Type
๏ match_phrase_prefix – “johnnie walker bl”
- slop
- max_expansions
✦ Index-Time Search-as-You-Type – edge n-grams
๏ “quick” → q, qu, qui, quic, quick
๏ Storage vs. performance
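
Index-time search-as-you-type can be sketched in Python as follows. Each token is expanded into its edge n-grams when it is indexed, so a lookup at query time is a plain exact match on whatever the user has typed so far. The `min_gram` and `max_gram` parameter names mirror the edge n-gram settings in Elasticsearch; the index structure, helper names, and product titles are made up for illustration.

```python
from collections import defaultdict


def edge_ngrams(token, min_gram=1, max_gram=20):
    """All prefixes of the token between min_gram and max_gram characters."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def build_prefix_index(titles):
    """Map every edge n-gram of every word to the ids of titles containing it."""
    index = defaultdict(set)
    for title_id, title in titles.items():
        for word in title.lower().split():
            for gram in edge_ngrams(word):
                index[gram].add(title_id)
    return index


def suggest(index, typed):
    """Exact lookup of the typed prefix; no prefix scan is needed at query time."""
    return index.get(typed.lower(), set())


# Made-up product titles.
titles = {
    1: "Johnnie Walker Black Label",
    2: "Johnnie Walker Blue Label",
    3: "Jack Daniels",
}
index = build_prefix_index(titles)

print(edge_ngrams("quick"))
print(sorted(suggest(index, "bl")))
print(len(index), "index entries for", len(titles), "titles")
```

The last line shows the trade-off named above: lookups are cheap, but the index holds many more entries than there are distinct words.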
Dealing with Human Language
• Language Analyzers - Many
✦ Tokenize text into individual words – consider Chinese, which has no spaces between words
✦ Lowercase tokens
✦ Remove stopwords – a, an, and, are, as, at, be, but, for, if, into …
✦ Stem tokens to their root form – foxes → fox
• Synonyms – jump, leap, and hop
• Dictionary
• Typos and misspellings – Fuzzy Query
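
The analysis steps listed above, and the inverted index they feed, can be sketched in plain Python. Everything here is a toy: the stopword set contains only the words shown on the slide, the stemmer strips a few English plural endings and is nothing like a real language stemmer, and the regular-expression tokenizer would not work for languages without spaces between words.

```python
import re
from collections import defaultdict

# Only the stopwords listed above; real analyzers ship longer lists.
STOPWORDS = {"a", "an", "and", "are", "as", "at", "be", "but", "for", "if", "into"}


def stem(token):
    """Toy stemmer: strips a few plural endings (foxes -> fox, jumps -> jump)."""
    for suffix in ("xes", "ses", "ches", "shes"):
        if token.endswith(suffix):
            return token[:-2]
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def analyze(text):
    """Lowercase, tokenize, drop stopwords, then stem what is left."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(token) for token in tokens if token not in STOPWORDS]


def build_inverted_index(docs):
    """Map each analyzed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


# Made-up documents.
docs = {
    1: "The quick brown foxes",
    2: "A fox jumps",
    3: "Lazy dogs",
}
index = build_inverted_index(docs)

print(analyze("Foxes are quick and into jumps"))
print(sorted(index["fox"]))
```

Because the query text goes through the same `analyze` function before lookup, a search for "Foxes" finds both documents that mention a fox.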
Real-time Reporting & Analytics - Aggregation
• Aggregations are a way to perform analytics on your indexed data
✦ Combination of buckets and metrics
✦ Buckets – collections of documents that meet a criterion
✦ Metrics – statistics calculated on the documents in a bucket
• Example: average salary per <country, gender, age> combination, in one request with one pass over the data!
✦ Partition documents by country (bucket)
✦ Partition each country by gender (bucket)
✦ Partition each gender bucket by age ranges (bucket)
✦ Calculate the average salary for each age range (metric)
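
The bucket and metric idea in this example can be imitated in memory with a few lines of Python. This is a sketch of the concept, not of how Elasticsearch executes aggregations: documents are partitioned by country, gender, and age range in a single pass, and an average salary is computed per bucket. The age ranges and the sample records are invented; the lower bound of each range is inclusive and the upper bound exclusive.

```python
from collections import defaultdict

# Hypothetical age ranges: (label, inclusive lower bound, exclusive upper bound).
AGE_RANGES = [("under 30", 0, 30), ("30 to 49", 30, 50), ("50 and over", 50, 200)]


def age_bucket(age):
    """Return the label of the age range that contains the given age."""
    for label, low, high in AGE_RANGES:
        if low <= age < high:
            return label
    return "other"


def average_salary(people):
    """One pass over the data: bucket by (country, gender, age range), then average."""
    totals = defaultdict(lambda: [0.0, 0])  # bucket key -> [salary sum, count]
    for person in people:
        key = (person["country"], person["gender"], age_bucket(person["age"]))
        totals[key][0] += person["salary"]
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Invented sample records.
people = [
    {"country": "England", "gender": "F", "age": 34, "salary": 50000},
    {"country": "England", "gender": "F", "age": 41, "salary": 60000},
    {"country": "England", "gender": "M", "age": 25, "salary": 40000},
    {"country": "Spain", "gender": "F", "age": 58, "salary": 70000},
]

averages = average_salary(people)
for key in sorted(averages):
    print(key, averages[key])
```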
Aggregations: Count by Country
Request:
GET /person/person/_search?search_type=count
{
  "aggs": {
    "by_country": {
      "terms": {
        "field": "address.country"
      }
    }
  }
}

Response:
{ ..., "aggregations" : {
  "by_country" : {
    "buckets" : [ {
      "key" : "England",
      "doc_count" : 30051
    }, {
      "key" : "Germany",
      "doc_count" : 30004
    }, {
      "key" : "France",
      "doc_count" : 15034
    }, {
      "key" : "Spain",
      "doc_count" : 14912
    } ]}}}

Chart: England 33%, Germany 33%, France 17%, Spain 17%
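
The percentages on this slide follow from the `doc_count` values in the response. A minimal Python sketch, using only the bucket data shown above (the elided parts of the response are left out, and the helper name is hypothetical):

```python
# Bucket data copied from the response; the rest of the response is omitted.
response = {
    "aggregations": {
        "by_country": {
            "buckets": [
                {"key": "England", "doc_count": 30051},
                {"key": "Germany", "doc_count": 30004},
                {"key": "France", "doc_count": 15034},
                {"key": "Spain", "doc_count": 14912},
            ]
        }
    }
}


def bucket_shares(response, agg_name):
    """Turn each bucket's doc_count into a rounded percentage of the total."""
    buckets = response["aggregations"][agg_name]["buckets"]
    total = sum(bucket["doc_count"] for bucket in buckets)
    return {
        bucket["key"]: round(100 * bucket["doc_count"] / total)
        for bucket in buckets
    }


shares = bucket_shares(response, "by_country")
print(shares)
```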
A lot more …
Elasticsearch Clients

• Java API
• Java REST Client
• JavaScript API
• Groovy API
• .NET API
• PHP API
• Perl API
• Python API
• Ruby API
• Community Contributed Clients: B4J, Clojure, Erlang, Go, Groovy, Haskell, Java, JavaScript, Kotlin, Lua, .NET, OCaml, Perl, PHP, Python, R, Ruby, Rust, Scala, Smalltalk, Vert.x
Kibana
Window into the Elastic Stack
• Visualize and analyze
• Geospatial
• Customize and share reports
• Graph exploration
• UX to secure and manage the Elastic Stack
• Build custom apps
Elastic Pioneer Program
Become an Elastic Pioneer. We want your feedback!
1. Download the 6.0 preview release (alpha, beta, etc.)
2. Provide feedback via GitHub or the Discuss forum
3. Get limited-edition Pioneer swag

THANK YOU

@elastic
www.elastic.co
