What is the answer to your question?
I've plotted here as count.
Yes, Kitty.
Katy.
The populations are different.
Yes, the sample sizes are different.
So how am I going to look at--
try and compare?
I can't set--
[INAUDIBLE]
My count is a count.
So it's like I happen to have a larger sample size in one
sample versus the other.
And I'm going to have a much higher bar than--
one could do the division in one's head,
but that doesn't seem to make too much sense.
So the moment you want to compare two objects,
density makes--
using the proportion or the density
makes much more sense than using the count.
So this is what it says.
It makes no sense to do it as count.
So let's do it as proportions.
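A minimal sketch of the count-versus-proportion point, not the code shown in the lecture. The data frame `heights`, its columns `height` and `source`, and the simulated values are all assumptions made up for illustration; only the idea of unequal sample sizes comes from the discussion above.

```r
library(ggplot2)

set.seed(1)
# Synthetic stand-ins for the two samples (values are invented, not real data).
# The sample sizes are deliberately unequal: 500 versus 2000.
heights <- rbind(
  data.frame(height = rnorm(500,  mean = 150, sd = 6), source = "Bihar"),
  data.frame(height = rnorm(2000, mean = 162, sd = 7), source = "US")
)

# Counts on the y axis: the larger sample gets much taller bars.
p_count <- ggplot(heights, aes(x = height, fill = source)) +
  geom_histogram(binwidth = 2, alpha = 0.5, position = "identity")

# Density on the y axis: each group's bars integrate to 1,
# so the two shapes can be compared despite the different sample sizes.
p_density <- ggplot(heights, aes(x = height, fill = source)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 2,
                 alpha = 0.5, position = "identity")
```

Printing `p_count` and then `p_density` should show the same bins with only the vertical scale changed per group.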
Now, that's a little bit better.
They overlap.
I don't really like the aesthetic of it.
So I'm going to plot instead the points.
This is not the density estimation.
This is a histogram.
But where the points are represented as points, and then
they are joined together.
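One way to get a histogram drawn as joined points is ggplot2's `geom_freqpoly()`. This is a sketch under the assumption that something like it was used on screen; the data frame `heights` and its simulated values are invented for illustration.

```r
library(ggplot2)

set.seed(1)
# Synthetic stand-in data (invented values, unequal sample sizes).
heights <- rbind(
  data.frame(height = rnorm(500,  mean = 150, sd = 6), source = "Bihar"),
  data.frame(height = rnorm(2000, mean = 162, sd = 7), source = "US")
)

# Same binning as a histogram, but each bin is a point and the points
# are connected by a line. Still a histogram, not a kernel density estimate.
p_poly <- ggplot(heights, aes(x = height, colour = source)) +
  geom_freqpoly(aes(y = after_stat(density)), binwidth = 2)
```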
We could do kernel--
I'm going to do the kernel in a minute.
I want to stop on this for a second
in order to learn some R thing, which is--
here I'm trying to now estimate--
actually I need to keep it because the grammar is not
ideal.
I've written it for kernel.
This is the same thing, but with the kernel distribution.
Oh, I thought I had made it nicer.
Doesn't matter.
So here I have--
I'm combining two graphs in one.
So in the same way that we can combine
two geometric functions in one plot,
I can also combine two data--
two graphs coming from different data into one plot.
I thought I had cleaned that this morning because then I
need to say, if I'm going to use data that comes from two
different data sets from--
in one single plot, I need to tell it every time what
is the data I'm using.
So inside-- so instead of putting
the data in the arguments here, I'm
going to put it inside here.
I need to put it in each of the geom_density calls.
So it doesn't hurt to have it here because it's
overridden by the next line anyway, but it's unnecessary.
So the first argument here is unnecessary.
Another problem with this code is
that this is unnecessary-- either this is unnecessary
or these two are unnecessary.
So if you wanted to minimize the characters in your code,
you would not include this one, and you would not
include these two.
You would just include this one.
So you would say, hi ggplot.
I would like a density plot of the height.
My first one is going to be for the adults female data.
And please put it in blue, and I'm going
to use a default bandwidth.
My second one is going to be the US data.
And I'm going to do it in red, and please use the default
bandwidth.
And then please label the thing and save.
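The request spelled out above could look roughly like this. It is a sketch, not the code from the slide: the data frame names `bihar_adult_females` and `us_adult_females`, the simulated values, the axis labels, and the output file name are all assumptions.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-ins for the two data sets (invented values).
bihar_adult_females <- data.frame(height = rnorm(500,  mean = 150, sd = 6))
us_adult_females    <- data.frame(height = rnorm(2000, mean = 162, sd = 7))

# The x mapping is stated once in ggplot() and shared by both layers.
# The data is stated in each geom_density(), because each layer
# draws from a different data set. No bw argument: default bandwidth.
p <- ggplot(mapping = aes(x = height)) +
  geom_density(data = bihar_adult_females, colour = "blue") +
  geom_density(data = us_adult_females,    colour = "red") +
  labs(x = "Height", y = "Density")

# Saving is left commented out so the sketch writes no files by default.
# ggsave("heights_density.png", plot = p)
```

Stating the mapping once and the data per layer keeps the two layers symmetrical, with nothing set in `ggplot()` that a later line overrides.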
Yes.
So just confirm, you could also take out
the data equal to Bihar adults, female--
I can take it out from here, yes, because it's
called by the previous one.
I could remove it from here.
Yes, so I could remove it.
Somehow it would not be very nice to look at because one--
it would be odd to have it here and not to--
it would be asymmetrical.
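The asymmetrical variant being discussed can be sketched as follows: a layer with no `data` argument inherits the data given to `ggplot()`, and a layer with its own `data` argument overrides it. The data frame names and simulated values are hypothetical.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-ins for the two data sets (invented values).
bihar_adult_females <- data.frame(height = rnorm(500,  mean = 150, sd = 6))
us_adult_females    <- data.frame(height = rnorm(2000, mean = 162, sd = 7))

p_inherit <- ggplot(bihar_adult_females, aes(x = height)) +
  # No data argument here: this layer uses the data from ggplot().
  geom_density(colour = "blue") +
  # This layer supplies its own data, which replaces the inherited one.
  geom_density(data = us_adult_females, colour = "red")
```

It draws the same two curves; the only difference is where the first data set is named.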
But this is also not very nice to look
at because it's called in the preamble when
it's not really needed.
I thought I fixed it.
Evidently, I forgot.
So that's nice.
We have the two densities, the answer to your question.
And we can start to say these things about them.