Sunday, March 27, 2022

Splunk

A widely used application for log monitoring and visualization that provides a complete solution for handling data center operations.




Splunk captures, indexes, and correlates real-time data
in a searchable repository, from which it can generate graphs,
reports, alerts, dashboards, and visualizations.
Splunk is a horizontal technology used for application
management, security and compliance, as well as business
and web analytics.

What is Splunk?
An application platform to search, analyze, and visualize
machine-generated data gathered from different systems,
such as on-premise applications and cloud virtual machines.

Any on-premise server, cloud server, endpoint application, or
other device that generates logs can feed Splunk, which helps
with monitoring and operating those applications.
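As a rough illustration, a Splunk search (SPL) over web access logs could look like the sketch below; the index and sourcetype names are placeholders, not values taken from these notes.

index=web sourcetype=access_combined status>=500
| stats count by host, status
| sort -count

This counts server errors per host so they can be charted on a dashboard or turned into an alert.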




Kafka




 

Apache Kafka decouples your data streams from your systems:
source systems send data into Kafka,
and target systems read data from Kafka.

What can we put in Kafka?
Any data stream we can think of,
e.g. website events,
user interactions, pricing data, financial projections.
Once your data is in Kafka,
you can send it to any system you like:
database
analytics
email system
audit

Why Apache Kafka?
Created by LinkedIn, open source
Distributed, resilient architecture, fault tolerant
Horizontal scalability
 can scale to 100s of brokers
 can scale to millions of messages per second
High performance (latency of less than 10 ms) - real time
Used by 2000+ firms, including 35% of the Fortune 500

Use cases
Messaging system
Activity tracking
Gathering metrics from many different locations
Application log gathering
Stream processing (with the Kafka Streams API or Spark, for example)
De-coupling of system dependencies
Integration with Spark, Flink, Storm, Hadoop, and many other big data technologies

Examples
Netflix uses Kafka to apply recommendations in real time while you are watching TV shows.

Uber uses Kafka to gather user, taxi, and trip data in real time to compute and forecast demand
and to compute surge pricing in real time.

LinkedIn uses Kafka to prevent spam and to collect user interactions to make better
connection recommendations in real time.

Note: Kafka is used only as a transportation mechanism.
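As a hedged sketch of that source-to-target decoupling, the console tools that ship with Kafka can stand in for both sides; the topic name and the localhost:9092 broker address below are assumptions for illustration.

# create a topic
bin/kafka-topics.sh --create --topic website-events --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1

# source system: publish events to Kafka
bin/kafka-console-producer.sh --topic website-events --bootstrap-server localhost:9092

# target system: read the same events independently
bin/kafka-console-consumer.sh --topic website-events --bootstrap-server localhost:9092 --from-beginning

The producer and consumer never talk to each other directly; each only knows about the Kafka topic.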


AWS Service Catalog


AWS Service Catalog allows IT administrators to create,
manage, and distribute catalogs of approved products to end users,
who can then access the products they need in a personalized portal.
Administrators can control which users have access to each product
to enforce compliance with organizational business policies.

AWS Service Catalog steps:
1. Create a portfolio.
2. Create a product. A product uses a template, e.g. for EC2 instance creation or VPC creation (a minimal template sketch is shown after this list).
3. Attach the product to the portfolio.
Add the groups, roles, or users that should have access - they must have the AWSServiceCatalogEndUserFullAccess policy.
4. Now log in as that end user, go to Products, and launch the product.
To view already launched products,
go to Provisioned Products.
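Products are typically backed by CloudFormation templates; a minimal sketch of the EC2-style template mentioned in step 2 could look like this (the AMI ID and instance type are placeholders, not values from these notes).

AWSTemplateFormatVersion: "2010-09-09"
Description: Minimal EC2 product template (placeholder values)
Resources:
  AppInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-0123456789abcdef0   # placeholder AMI
      InstanceType: t3.micro

The administrator uploads a template like this when creating the product, and end users launch it without needing to see or edit the template.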
 

 







 

Thursday, March 17, 2022

Prometheus


https://www.youtube.com/watch?v=KY0Kzmm_928


What is Prometheus?
Primarily used for metrics monitoring.
It uses PromQL as its query language.
Prometheus is a pull-based metrics monitoring system: it needs the location of each endpoint,
and the endpoint needs to expose its metrics.

Monitoring using Prometheus
Monitor a Kubernetes cluster
Query time-series data to generate graphs and tables
Create alerts
Open source

Where does Prometheus Live?
Port 9090

Node exporter
Port 9100

 

Downloading and running Prometheus
https://prometheus.io/download/#prometheus

wget https://github.com/prometheus/prometheus/releases/download/v2.34.0/prometheus-2.34.0.linux-amd64.tar.gz

https://prometheus.io/docs/prometheus/latest/getting_started/


Configuring Prometheus to monitor itself
Check the config in the yml file below:
prometheus.yml
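A minimal prometheus.yml for self-monitoring looks roughly like this; the 15s scrape interval is just an example, not a value from these notes.

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]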

Starting Prometheus
./prometheus --config.file=prometheus.yml

Node exporter
Targets in Prometheus are exporters - node exporter
or any other exporter - so here we use node exporter.


https://prometheus.io/docs/guides/node-exporter/

wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz

Starting Node Exporter
./node_exporter

After starting Node Exporter,
we need to tell Prometheus about it
and ask it to scrape (collect) the data.
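That is done by adding a scrape job for port 9100 to prometheus.yml and restarting Prometheus; the job name below is just an illustrative choice.

scrape_configs:
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]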

How to configure rules?
If you want to see granular data,
like day-wise or weekly, you can also get it with a query.
But when you run a query, it processes all the data in real time before showing the result,
so the processing for that task can slow down your server.

So if you already know you are going to look at particular daily or weekly data,
you can create a rule for it, and the rule will store only the processed
result in Prometheus.

Configure rules for aggregating scraped data into new time series

groups:
- name: cpu-node
  rules:
  - record: job_instance_mode:node_cpu_seconds:avg_rate5m
    expr: avg by (job, instance, mode) (rate(node_cpu_seconds_total[5m]))

record - the name we give to the new metric
expr - the query whose result is stored

Go to the Prometheus folder and create a file
named cpunode.yml
containing the YAML above.

Then go to the prometheus.yml file
and add the path under rule_files:
rule_files:
   - "cpunode.yml"


Now restart Prometheus.

Now if we query the record above - job_instance_mode:node_cpu_seconds:avg_rate5m -
it will directly return the precomputed result.

How to connect Prometheus with Grafana?

Prometheus in hindi

https://www.youtube.com/watch?v=UbfpughYouw&t=69s

Prometheus has its own data store - it can store many kinds of data,
e.g. CPU, application, and disk-space metrics.
If no other database is given,
it stores the data under /var/lib/prometheus -
that is where it installs and stores data by default.
Port - 9090

How is data given to Prometheus?
Via the two agents below:
E.S. (Elasticsearch)
cAdvisor
Both have the task of bringing data and giving it to Prometheus,
so Prometheus filters the data by itself
and starts showing it,
 e.g. CPU data
 and health-check data.

This is how Prometheus keeps the data on its end;
the data is kept in a queryable form.

But how is it made visible?
That is done by Grafana.

Grafana basically reads that data and provides the visuals.

Exporter
Data can come in two ways:
1. Local data
2. Exporter - it collects data from servers A and B
(app data, Jenkins app metrics, etc.) and puts it in the datastore.

Exporter types
HTTP, Node
Every server can run a node exporter.
The node exporter exposes the data so that Prometheus can pull (scrape) it.

How is data brought into Prometheus?

Node exporter


https://www.youtube.com/watch?v=1fiq2yPQhXs&list=PLdsu0umqbb8NxUs8r8BIUe9-PhcoZyojA&index=17

Kibana



 


 
 

 

Logstash - collects logs and transforms them.
Logstash takes a whole log file as input and passes the output to Elasticsearch, where it is indexed.
Logstash works on both files and databases, e.g. system logs, app logs, or any particular data you want to analyze.

Elasticsearch -
analyzes the logs sent by Logstash and does the indexing.
Elasticsearch handles searching and indexing;
it stores the index in memory in key-value form,
so whenever you type a key (keyword), it shows its value.

Kibana -
visualizes the logs provided by Elasticsearch.

Logstash has many plugins that take different kinds of input and provide output, e.g. for system logs and app logs.

In the same way we can also read a database. E.g. we have a banking application and want to see which clients logged in today and what and how many transfers they made; the process is the same - collect the data with Logstash, dump it into Elasticsearch, visualize it in Kibana.

Say we have 10 servers and an issue on 1 of them.
From the system logs and application logs we can identify and narrow down where the problem came from, and we can resolve it.

If any session or page is taking too long to load, we can identify it after putting the logs on the dashboard, and we can see which response is taking so much time.

In case we get an exception or error in the system, we can take that exception or error as a string, create an alert for it, and have it trigger when it occurs, with an action such as emailing the system admin.

For CPU/memory usage, or any likelihood of a server going down, we can set a threshold and trigger an alert.

All three - Logstash, Elasticsearch, Kibana - can also work individually.

 


Logstash plugins

For Logstash, you need plugins for different kinds of data and log input,
e.g. logstash-input-jdbc, logstash-input-s3,
logstash-output-s3, logstash-output-http

How to install a Logstash plugin:
logstash-plugin install logstash-input-file

How to configure Logstash?
Go to the config directory and open the config file.

Where do we put the data in Elasticsearch?
elasticsearch - data - node

Taking input from Beats and sending output to Elasticsearch:
input {
  beats {
   port => 5044
 }
}

output {
 elasticsearch {
  hosts => ["http://localhost:9200"]
  index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
  #user => "elastic"
  #password => "changeme"
 }
}
--------------------------------------------------------------------

File input plugin with output to Elasticsearch:
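The original config is not reproduced here; a rough sketch, assuming a syslog file path and an index name chosen only for illustration, would be:

input {
  file {
    path => "/var/log/syslog"
    start_position => "beginning"
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "syslog-%{+YYYY.MM.dd}"
  }
}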

 -----------------------------------------------------------------------------------

JDBC input:
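The original config is not reproduced here either; a hedged sketch for the banking-database example above, with the driver path, credentials, schedule, and table name all placeholders, could look like:

input {
  jdbc {
    jdbc_driver_library => "/opt/drivers/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/bankdb"
    jdbc_user => "elastic"
    jdbc_password => "changeme"
    schedule => "* * * * *"
    statement => "SELECT * FROM transactions WHERE created_at > :sql_last_value"
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "transactions"
  }
}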


 ------------------------------------------------------------------------------------------

Latest version - 7.12

Kibana is not a datastore.
Kibana should be configured to run against an Elasticsearch node of the same version; this is the officially supported configuration.
Best suited for log analysis.

Interface
Web-based interface
Runs on port 5601 by default

Dashboard & Visualization
Best suited to analyze logs
Supports text-based analysis
Offers different visualization capabilities

Supported Datasources
Supports only Elasticsearch
LDAP integration

Querying
Uses the Elasticsearch query language to query data

Alerting
Graphite, Prometheus, InfluxDB, MySQL, PostgreSQL, and Elasticsearch

Community
Has an excellent community
https://www.elastic.co/community/


Elasticsearch

Search engine - like Google: when we enter a keyword, it gives results.
Free & open source.
Full-text search - e.g. "manjeet" written in any column can be searched;
in an RDBMS such as MySQL this is difficult.

Scalable
 Vertical - increase the configuration of a single system.
 Horizontal - add parallel servers.
Inverted index - like the index at the end of a book (e.g. a McGraw Hill book): a word with its page info.
Schema-free - it creates the schema according to the data, unlike an RDBMS.
JSON - ELK is written in Java, but we can use different APIs (JSON over HTTP) to communicate and exchange data.

ELK is a full-fledged application.
ELK is based on Lucene (a search library).

Apache Lucene
Search library (IR)
Free
Open source
Java
Inverted index

Difference between ELK and Lucene: Lucene has some limitations, e.g. if you need to run it across many nodes, it becomes difficult.

Terminology
Cluster - multiple machines combined together are called a cluster.
Node - a single instance.
Index - like a table in MySQL.
Document - like a row in MySQL.
Field - like a column in MySQL.
Mapping - like a schema in MySQL.

In distributed form:
Shard - we have replica copies of the data on multiple servers; one is in write mode and the others replicate it.
Primary shard - the shard that receives the writes.
Replica shard - the copies other than the primary.

Elastic Vs Mysql
Index = Table
Mapping = Schema
Field = Column
Document = Row
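To make the analogy concrete, here is a hedged sketch of indexing a "row" as a document and then searching it; the customers index and field values are made up for illustration.

# index a document (a "row") into the "customers" index (a "table")
curl -X POST "localhost:9200/customers/_doc" -H 'Content-Type: application/json' -d '{"name": "manjeet", "city": "delhi"}'

# full-text search across the index
curl -X GET "localhost:9200/customers/_search?q=manjeet"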

Use case - why use ELK

If you have hundreds of servers and need to find one application's logs, it is difficult.

Also, if logins to the application server are restricted but the logs live there, you need to provide some other way for people to view the logs.

ELK installation
Go to the Elastic website and download.

Elasticsearch installation
Go to the Elastic website and download.
After downloading,
go to the bin folder of Elasticsearch
and run ./elasticsearch
Default port for Elasticsearch is 9200.

Same for Kibana:
go to the bin folder of Kibana
and run ./kibana
Default port for Kibana is 5601.

 https://www.youtube.com/watch?v=nsJar753ROc&list=PLTgwj-KL1pO2I0EQu8lDbhoH1CpLIHg9d

ELK
E - Elasticsearch
L - Logstash - limitation - supports around 100 queries
K - Kibana

EFK - F is Fluentd
supports around 1000 queries
e.g. time-range queries such as 4-6 pm
querying microservice data

ELK / EFK stack

EFK is currently popular.

If you are working with containers you must know about Filebeat/Metricbeat;
you can say you bring in log data using Filebeat/Metricbeat and view it using Kibana.

How is data brought into EFK?

Fluentd is a kind of agent; you can install it on any client machine.

Part 1 ELK

https://www.youtube.com/watch?v=JrqdVGzSe8U&t=3s

Part 2 ELK

https://www.youtube.com/watch?v=TAgPoAJsv8Q


https://www.youtube.com/watch?v=nsJar753ROc&list=PLTgwj-KL1pO2I0EQu8lDbhoH1CpLIHg9d

Wednesday, March 16, 2022

Grafana








Query
Visualize
Alert

Grafana is open source
Supports all major databases - in the same dashboard
Dynamic dashboards & filters
Explore metrics and logs
Alerting

Version - 7.4.2
Grafana is a visualization tool; it does not store data itself,
so we need a data store for it.

Default port 3000

 
 
 


Telegraf - collects system data - it is an agent that collects system-related data. We put it on the server where we want to collect data such as system metrics: CPU utilization, memory utilization, etc.

InfluxDB - stores the system data (with timestamps); a telemetry (time-series) database that basically collects the data sent by Telegraf.

Grafana - visualizes the data collected in InfluxDB; we can create beautiful dashboards using Grafana.

Steps to download Grafana
go to https://Grafana.com/grafana/download

Install Influx DB
https://influxdata.com/influxdb/v1.8/introduction/install




Telegraf

https://influxdata.com/influxdb/v1.17/introduction/install


 

Configure Telegraf
/etc/telegraf/telegraf.conf
Uncomment:
urls = ["http://127.0.0.1:8086"]

Restart the telegraf service.
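For reference, the relevant parts of /etc/telegraf/telegraf.conf end up looking roughly like this sketch; the list of input plugins is an assumption that matches the measurements seen later in these notes.

[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"]
  database = "telegraf"

[[inputs.cpu]]
[[inputs.mem]]
[[inputs.disk]]
[[inputs.system]]
[[inputs.processes]]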

for explore Grafana dashboard feature visit below link

 https://www.youtube.com/watch?v=E6Me2slK6zk

grafana full course

https://www.youtube.com/watch?v=CMvOekuOvSo

 

If you have many dashboards you can search for them from the search bar.

+ - from the plus icon you can create a new dashboard.

Import - you can import an already created dashboard.

Panel

A part of a dashboard; a dashboard is a collection of multiple panels.




Dashboard settings
Here you can add your variables.
Dashboard settings only apply to your current dashboard;
if you have 5 different dashboards,
you can have 5 different settings, one for each dashboard.


 
Cycle view mode - hide/unhide the side bar from the panel
Filter option - time frame
 
  

zoom-in zoom-out
Refresh time



Import

If you want to import a dashboard, you can use Import dashboard.

Folder - in a folder we can group different dashboards according to use.

Explore - used when you want to try different data sources and queries.

Alerting - when you create a new alert, e.g. CPU utilization more than 80%,
and you want the alert on Telegram and email, we can use different
notification channels.



 
 
 
Server Admin -> Stats - here we can see stats for all users and admins.

Currently logged-in users

Preferences changed by the main (server admin) user apply to all users; Preferences changed by an individual user apply only to that particular user.


InfluxDB stores all the data, and we connect Grafana with InfluxDB.

Integrate InfluxDB with Grafana:
Settings - Configuration - Data Sources


To check that data has reached InfluxDB,
go to the server
and type influx - the influx CLI will open.
> show databases;
name: databases
telegraf
> use telegraf;
> show measurements;
cpu
disk
diskio
kernel
mem
processes
swap
system
>

InfluxDB is a telemetry database; it keeps the data with timestamps.
Telegraf is the agent on the client machine that sends data to the InfluxDB database.

>select * from cpu limit 5;
 
Connect the data source to Grafana
Go to the web UI:
Settings - Configuration - Add data source
and choose your data source from the list.

In the HTTP section we define where our data source (InfluxDB) is.

Database - telegraf

Save & Test
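The same data source can also be provisioned from a YAML file instead of the UI; a sketch, assuming the default provisioning directory /etc/grafana/provisioning/datasources/ and the local InfluxDB used in these notes:

apiVersion: 1
datasources:
  - name: InfluxDB
    type: influxdb
    access: proxy
    url: http://localhost:8086
    database: telegraf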


Create a panel
Go to the panel.
In the panel settings you have a data source option; if you have more than one data source, you can select among them (even multiple data sources).
After selecting the database, build the query.
In Grafana we can create multiple queries:
From - default (a selection option like telegraf) - use default only
Select measurement - (as seen in the CLI: cpu, disk, kernel, etc.)
Select field - (usage_system, usage_idle, usage_nice, etc.)

Create a panel for CPU utilization; the query builder generates InfluxQL like the sketch below.
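This is a hedged example of the InfluxQL the editor produces for the selections above; $timeFilter and $__interval are macros that Grafana fills in.

SELECT mean("usage_system") FROM "cpu" WHERE $timeFilter GROUP BY time($__interval) fill(null)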


In the same way, now create one for memory utilization.


Create an uptime panel.

Convert the unit into seconds:
Unit - seconds
Decimals - 0
Thresholds - red - 80 - base - green

Create a panel for logged-in users.

From - system 

Select - rt_users

visualization - stat

 

Create a panel for memory utilization:

From - mem

Select - used_percent

Thresholds

Red - 80

Base - green

If you want to add more thresholds, click on Add threshold
and set the value (below the 100 max) and a color for the new threshold.


 
Create Total Processes
From - processes
Select - total
Visualization - stat

Create Total Threads
From - processes
Select - total_threads

Create Disk Utilization
From - disk
Select - used_percent
Group by - click + tag(path)
disk mean(path:/)
disk mean(path:/boot/efi)
Panel type - Bar gauge
Alias by - $tag_path - this shows only the path instead of the full tag label in the panel.
 



Now save the dashboard and give it a name. To check that the dashboard is available, go to Home - recently viewed dashboards - and click the saved dashboard.

Multi-server monitoring
Install Telegraf on the other servers also.
Create a panel:
From - cpu
Select - usage_system

It will show a combined value if Group by is not selected,
so select Group by:
Group by - click + - tag(host)

Alias by - $tag_host - if you want to remove the tag host label from the panel


Convert to Row - if you want to hide all the current panels in your dashboard, you can convert them to a row and then hide/unhide the row from the dashboard.

If you want to rename the title - click the row options - Title - give a name like Summary - Update.

For multi-host CPU utilization:

Legend values - min, max, avg, current, total; from the options we can also select As table.

The legend can also be placed on the right side.


But suppose you have 100 machines
and you want to monitor only 2-3 of them.



Go to dashboard settings
Go to Variables - Add variable
Create a query

How to see all hosts in InfluxDB:
$ influx
> use telegraf;
> show tag values from cpu with key=host  - press enter

Create Query

Name - Servers

Label - Select A server(s)

Query - show tag values from cpu with key=host

After you enter the query, the values appear automatically in the Preview of values.



Now go back to the dashboard - you get a drop-down with the server list.

If you want to be able to select multiple servers:

Edit variable - Selection options -

Multi-value - enable

If you want to provide an All option in the drop-down:

Edit variable - Selection options -

Include All option - enable

 

Table panel - for displaying data

From - processes
Select - field(total)
Group by - time(1m), tag(host)
Format as - Table

Limit - 10

In the panel - Field - go to Cell display mode - Color text

Thresholds
80 - red
base - green