Java 8 Stream API. A different way to process collections.
The document discusses the Java 8 Stream API, emphasizing its ability to process collections in a declarative manner using internal iteration. It explains the difference between collections and streams, detailing how streams offer features like concise syntax, parallel processing, and the concept of lazy evaluation. Additionally, it covers the use of lambdas, method references, and provides examples of stream operations and performance comparisons between regular and parallel streams.
A Stream is…
A convenience method to iterate over
collections in a declarative way
List<Integer> numbers = new ArrayList<Integer>();
for (int i = 0; i < 100; i++) {
    numbers.add(i);
}
List<Integer> evenNumbers = new ArrayList<>();
for (int i : numbers) {
    if (i % 2 == 0) {
        evenNumbers.add(i);
    }
}
@dgomezg
A Stream is…
A convenience method to iterate over
collections in a declarative way
List<Integer> numbers = new ArrayList<Integer>();
for (int i = 0; i < 100; i++) {
    numbers.add(i);
}
// toList() requires: import static java.util.stream.Collectors.toList;
List<Integer> evenNumbers = numbers.stream()
    .filter(n -> n % 2 == 0)
    .collect(toList());
So… Streams are collections?
Not really
Collections                  Streams
Sequence of elements         Sequence of elements
Computed at construction     Computed at iteration
In-memory data structure     Traversable only once
External iteration           Internal iteration
Finite size                  Possibly infinite size
Iterating a Collection
List<Integer> evenNumbers = new ArrayList<>();
for (int i : numbers) {
    if (i % 2 == 0) {
        evenNumbers.add(i);
    }
}
External Iteration
- Use forEach or Iterator
- Very verbose
Parallelism by manually using Threads
- Concurrency is hard to get right!
- Lots of contention and error-prone
- Thread-safety
Iterating a Stream
List<Integer> evenNumbers = numbers.stream()
    .filter(n -> n % 2 == 0)
    .collect(toList());
Internal Iteration
- No manual Iterators handling
- Concise
- Fluent API: chain sequence processing
Elements computed only when needed
Iterating a Stream
List<Integer> evenNumbers = numbers.parallelStream()
    .filter(n -> n % 2 == 0)
    .collect(toList());
Easy Parallelism
- Concurrency is hard to get right!
- Uses ForkJoin
- Process steps should be
  - stateless
  - independent
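The requirement that process steps be stateless and independent can be sketched as follows. The class `ParallelEvenNumbers` and its method `evens` are hypothetical names used only for illustration. Because the filter looks at each element on its own and `collect` merges the partial results, the parallel and the sequential pipeline should produce the same list.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the original slides.
class ParallelEvenNumbers {

    // The filter is stateless and independent: it only inspects its own element.
    // collect() combines the partial results of each worker thread safely.
    // (Adding to a shared ArrayList from forEach() would NOT be thread-safe.)
    static List<Integer> evens(List<Integer> numbers, boolean parallel) {
        return (parallel ? numbers.parallelStream() : numbers.stream())
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = IntStream.range(0, 100)
                .boxed()
                .collect(Collectors.toList());
        // Both variants keep the encounter order of the source list.
        System.out.println(evens(numbers, false).equals(evens(numbers, true)));
    }
}
```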
Lambda Types
Based on the abstract method signature of a
@FunctionalInterface:
(Arguments) -> <return type>
@FunctionalInterface
public interface Predicate<T> {
    boolean test(T t);
}
T -> boolean
@FunctionalInterface
public interface Runnable {
    void run();
}
() -> void
@FunctionalInterface
public interface Supplier<T> {
    T get();
}
() -> T
@FunctionalInterface
public interface BiFunction<T, U, R> {
    R apply(T t, U u);
}
(T, U) -> R
@FunctionalInterface
public interface Comparator<T> {
    int compare(T o1, T o2);
}
(T, T) -> int
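As a small sketch of how these shapes are used, each lambda below is assigned to the functional interface whose abstract method it matches. The class name `LambdaTypes` and the constant names are made up for this example.

```java
import java.util.Comparator;
import java.util.function.BiFunction;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical demo class, not part of the original slides.
class LambdaTypes {
    // T -> boolean
    static final Predicate<Integer> IS_EVEN = n -> n % 2 == 0;
    // () -> T
    static final Supplier<String> GREETING = () -> "hello";
    // (T, U) -> R
    static final BiFunction<Integer, Integer, Integer> ADD = (a, b) -> a + b;
    // (T, T) -> int
    static final Comparator<String> BY_LENGTH =
            (a, b) -> Integer.compare(a.length(), b.length());

    public static void main(String[] args) {
        // () -> void
        Runnable task = () -> System.out.println("running");
        task.run();
        System.out.println(IS_EVEN.test(4));
        System.out.println(GREETING.get());
        System.out.println(ADD.apply(2, 3));
        System.out.println(BY_LENGTH.compare("a", "bb") < 0);
    }
}
```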
Method References
Allows you to use a method name as a lambda
Usually better readability
Syntax:
<TargetReference>::<MethodName>
TargetReference: Instance or Class
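The syntax above covers several cases, sketched here next to the lambda each one stands for. `MethodReferences` and its member names are hypothetical and only serve this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original slides.
class MethodReferences {
    // Class::staticMethod, equivalent to s -> Integer.parseInt(s)
    static final Function<String, Integer> PARSE = Integer::parseInt;
    // Class::instanceMethod, equivalent to s -> s.toUpperCase()
    static final Function<String, String> UPPER = String::toUpperCase;
    // Class::new, equivalent to () -> new ArrayList<>()
    static final Supplier<List<String>> NEW_LIST = ArrayList::new;

    static List<String> upperCase(List<String> words) {
        return words.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // instance::method, equivalent to s -> System.out.println(s)
        upperCase(Arrays.asList("stream", "api")).forEach(System.out::println);
        System.out.println(PARSE.apply("42") + 1);
    }
}
```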
Characteristics of a Stream
• Interface to Sequence of elements
• Focused on processing (not on storage)
• Elements computed on demand
(or extracted from source)
• Can be traversed only once
• Internal iteration
• Parallel Support
• Could be Infinite
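The "traversed only once" point can be checked with a short sketch. The class and method names are hypothetical. A second terminal operation on the same Stream object is rejected with an `IllegalStateException`, so a new stream has to be created from the source for each traversal.

```java
import java.util.Arrays;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original slides.
class SingleUseStream {

    // Returns true if the second traversal of the same Stream is rejected.
    static boolean secondTraversalFails() {
        Stream<Integer> stream = Arrays.asList(1, 2, 3).stream();
        stream.forEach(n -> { });      // first traversal consumes the stream
        try {
            stream.forEach(n -> { });  // second traversal on the same object
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTraversalFails());
    }
}
```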
Anatomy of a Stream
Source -> Intermediate operations (filter, map, order, function) -> Final operation
(the whole chain is the pipeline)
Anatomy of Stream Iteration
1. Start from the data source (usually a
collection) and create the Stream
List<Integer> numbers =
    Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
Stream<Integer> numbersStream = numbers.stream();
Anatomy of Stream Iteration
2. Add a chain of intermediate operations
(Stream Pipeline)
Stream<Integer> numbersStream = numbers.stream()
    .filter(new Predicate<Integer>() {
        @Override
        public boolean test(Integer number) {
            return number % 2 == 0;
        }
    })
    .map(new Function<Integer, Integer>() {
        @Override
        public Integer apply(Integer number) {
            return number * 2;
        }
    });
Anatomy of Stream Iteration
2. Add a chain of intermediate operations
(Stream Pipeline) - Better using lambdas
Stream<Integer> numbersStream = numbers.stream()
    .filter(number -> number % 2 == 0)
    .map(number -> number * 2);
Anatomy of Stream Iteration
3. Close with a Terminal Operation
List<Integer> numbersStream = numbers.stream()
    .filter(number -> number % 2 == 0)
    .map(number -> number * 2)
    .collect(Collectors.toList());
• The terminal operation triggers the Stream iteration
• Before that, nothing is computed.
• Depending on the terminal operation, the
stream may or may not be fully traversed.
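A minimal sketch of this behaviour: the pipeline records each step in a log, so it becomes visible that nothing runs until the terminal operation, and that a short-circuiting terminal operation such as `findFirst()` stops as soon as it has a result. The class `LazyEvaluation` and the log messages are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original slides.
class LazyEvaluation {

    static List<String> trace() {
        List<String> log = new ArrayList<>();
        // Building the pipeline does not run filter() or map().
        Stream<Integer> pipeline = Stream.of(1, 2, 3, 4)
                .filter(n -> { log.add("filter " + n); return n % 2 == 0; })
                .map(n -> { log.add("map " + n); return n * 2; });
        log.add("pipeline built");
        // The terminal operation triggers the iteration and stops
        // at the first element that makes it through the pipeline.
        Optional<Integer> first = pipeline.findFirst();
        log.add("result " + first.get());
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

In this sequential run the elements 3 and 4 are never filtered or mapped, because `findFirst()` already has its answer after element 2.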
Operation Types
Intermediate operations
• Always return a Stream
• Chain as many as needed (Pipeline)
• Guide the processing of data
• Do not start processing
• Can be stateless or stateful
Terminal operations
• Can return an object, a collection, or void
• Start the pipeline process
• After their execution, the Stream cannot be
revisited
Intermediate Operations
// T -> boolean
Stream<T> filter(Predicate<? super T> predicate);
// T -> R
<R> Stream<R> map(Function<? super T, ? extends R> mapper);
// (T, T) -> int
Stream<T> sorted(Comparator<? super T> comparator);
Stream<T> sorted();
// T -> void
Stream<T> peek(Consumer<? super T> action);
Stream<T> distinct();
Stream<T> limit(long maxSize);
Stream<T> skip(long n);
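Several of these operations can be chained in one pipeline. The sketch below, with the hypothetical class `IntermediateOps`, removes duplicates, sorts, skips the smallest value and keeps the next three.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original slides.
class IntermediateOps {

    static List<Integer> run(List<Integer> input) {
        return input.stream()
                .distinct()   // stateful: remembers the elements already seen
                .sorted()     // stateful: needs all elements before emitting any
                .skip(1)      // drop the smallest value
                .limit(3)     // keep at most three elements
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [2, 3, 4]
        System.out.println(run(Arrays.asList(5, 3, 5, 1, 4, 3, 2)));
    }
}
```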
Final Operations
Object[] toArray();
void forEach(Consumer<? super T> action); // T -> void
<R, A> R collect(Collector<? super T, A, R> collector);
java.util.stream.Collectors.toList();
java.util.stream.Collectors.toSet();
java.util.stream.Collectors.toMap(keyMapper, valueMapper);
java.util.stream.Collectors.joining(CharSequence);
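A short sketch of these collectors in use. `CollectorExamples` and its sample words are hypothetical. Note that the two-argument `toMap` throws an `IllegalStateException` on duplicate keys, which is why the example calls `distinct()` first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original slides.
class CollectorExamples {
    static final List<String> WORDS = Arrays.asList("stream", "api", "java", "api");

    // Duplicates collapse because the result is a Set.
    static Set<String> unique() {
        return WORDS.stream().collect(Collectors.toSet());
    }

    // Key: the word itself, value: its length.
    static Map<String, Integer> lengths() {
        return WORDS.stream()
                .distinct()
                .collect(Collectors.toMap(w -> w, String::length));
    }

    // Concatenates all elements with the given separator.
    static String joined() {
        return WORDS.stream().collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        System.out.println(unique().size());
        System.out.println(lengths().get("stream"));
        System.out.println(joined());
    }
}
```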
Final Operations (II)
// (T, T) -> T
Optional<T> reduce(BinaryOperator<T> accumulator);
// (T, T) -> int
Optional<T> min(Comparator<? super T> comparator);
// (T, T) -> int
Optional<T> max(Comparator<? super T> comparator);
long count();
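These reductions return an `Optional` because the stream may be empty. A minimal sketch, using a hypothetical class `Reductions` and made-up sample numbers:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical demo class, not part of the original slides.
class Reductions {
    static final List<Integer> NUMBERS = Arrays.asList(4, 8, 15, 16, 23, 42);

    // Combines the elements pairwise: ((4 + 8) + 15) + ...
    static Optional<Integer> sum() {
        return NUMBERS.stream().reduce(Integer::sum);
    }

    static Optional<Integer> min() {
        return NUMBERS.stream().min(Comparator.naturalOrder());
    }

    static Optional<Integer> max() {
        return NUMBERS.stream().max(Comparator.naturalOrder());
    }

    static long countEven() {
        return NUMBERS.stream().filter(n -> n % 2 == 0).count();
    }

    public static void main(String[] args) {
        System.out.println(sum().get());   // 108
        System.out.println(min().get());   // 4
        System.out.println(max().get());   // 42
        System.out.println(countEven());   // 4
    }
}
```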
Final Operations (III)
//T -> boolean
boolean anyMatch(Predicate<? super T> predicate);
boolean allMatch(Predicate<? super T> predicate);
boolean noneMatch(Predicate<? super T> predicate);
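The three matchers answer yes/no questions about a stream and can stop early once the answer is known. A small sketch with the hypothetical class `Matching`:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the original slides.
class Matching {
    static final List<Integer> NUMBERS = Arrays.asList(2, 4, 6, 7);

    // true as soon as one element satisfies the predicate
    static boolean anyOdd() {
        return NUMBERS.stream().anyMatch(n -> n % 2 != 0);
    }

    // false as soon as one element fails the predicate
    static boolean allEven() {
        return NUMBERS.stream().allMatch(n -> n % 2 == 0);
    }

    // true only if no element satisfies the predicate
    static boolean noneNegative() {
        return NUMBERS.stream().noneMatch(n -> n < 0);
    }

    public static void main(String[] args) {
        System.out.println(anyOdd());        // true
        System.out.println(allEven());       // false
        System.out.println(noneNegative());  // true
    }
}
```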
Usage examples - Context
public class Contact {
    private final String name;
    private final String city;
    private final String phoneNumber;
    private final LocalDate birth;

    public int getAge() {
        return Period.between(birth, LocalDate.now())
            .getYears();
    }
    // Constructor and getters omitted
}
Usage examples - Context
public class PhoneCall {
    private final Contact contact;
    private final LocalDate time;
    private final Duration duration;
    // Constructor and getters omitted
}
Contact me = new Contact("dgomezg", "Madrid", "555 55 55 55", LocalDate.of(1975, Month.MARCH, 26));
Contact martin = new Contact("Martin", "Santiago", "666 66 66 66", LocalDate.of(1978, Month.JANUARY, 17));
Contact roberto = new Contact("Roberto", "Santiago", "111 11 11 11", LocalDate.of(1973, Month.MAY, 11));
Contact heinz = new Contact("Heinz", "Chania", "444 44 44 44", LocalDate.of(1972, Month.APRIL, 29));
Contact michael = new Contact("michael", "Munich", "222 22 22 22", LocalDate.of(1976, Month.DECEMBER, 8));
List<PhoneCall> phoneCallLog = Arrays.asList(
    new PhoneCall(heinz, LocalDate.of(2014, Month.MAY, 28), Duration.ofSeconds(125)),
    new PhoneCall(martin, LocalDate.of(2014, Month.MAY, 30), Duration.ofMinutes(5)),
    new PhoneCall(roberto, LocalDate.of(2014, Month.MAY, 30), Duration.ofMinutes(12)),
    new PhoneCall(michael, LocalDate.of(2014, Month.MAY, 28), Duration.ofMinutes(3)),
    new PhoneCall(michael, LocalDate.of(2014, Month.MAY, 29), Duration.ofSeconds(90)),
    new PhoneCall(heinz, LocalDate.of(2014, Month.MAY, 30), Duration.ofSeconds(365)),
    new PhoneCall(heinz, LocalDate.of(2014, Month.JUNE, 1), Duration.ofMinutes(7)),
    new PhoneCall(martin, LocalDate.of(2014, Month.JUNE, 2), Duration.ofSeconds(315))
);
People I phoned in June
phoneCallLog.stream()
    .filter(phoneCall ->
        phoneCall.getTime().getMonth() == Month.JUNE)
    .map(phoneCall -> phoneCall.getContact().getName())
    .distinct()
    .forEach(System.out::println);
Seconds I talked in May
// summingLong requires: import static java.util.stream.Collectors.summingLong;
Long total = phoneCallLog.stream()
    .filter(phoneCall ->
        phoneCall.getTime().getMonth() == Month.MAY)
    .map(PhoneCall::getDuration)
    .collect(summingLong(Duration::getSeconds));
Seconds I talked in May
// reduce(Duration::plus) combines Durations, so the result is an Optional<Duration>
Optional<Duration> total = phoneCallLog.stream()
    .filter(phoneCall ->
        phoneCall.getTime().getMonth() == Month.MAY)
    .map(PhoneCall::getDuration)
    .reduce(Duration::plus);
total.ifPresent(duration ->
    System.out.println(duration.getSeconds())
);
Did I phone Paris?
boolean phonedToParis = phoneCallLog.stream()
    .anyMatch(phoneCall ->
        "Paris".equals(phoneCall.getContact().getCity()));
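Give me the 3 longest phone calls
// comparing requires: import static java.util.Comparator.comparing;
phoneCallLog.stream()
    .filter(phoneCall ->
        phoneCall.getTime().getMonth() == Month.MAY)
    // longest first: sort by duration in descending order
    .sorted(comparing(PhoneCall::getDuration).reversed())
    .limit(3)
    .forEach(System.out::println);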
Give me the 3 shortest ones
phoneCallLog.stream()
    .filter(phoneCall ->
        phoneCall.getTime().getMonth() == Month.MAY)
    // shortest first: sort by duration in ascending order
    .sorted(comparing(PhoneCall::getDuration))
    .limit(3)
    .forEach(System.out::println);
Streams can be created from
Collections
Directly from values
Generators (infinite Streams)
Resources (like files)
Stream ranges
From collections
use stream()
List<Integer> numbers = new ArrayList<>();
for (int i = 0; i < 10_000_000; i++) {
    numbers.add((int) Math.round(Math.random() * 100));
}
Stream<Integer> evenNumbers = numbers.stream();
or parallelStream()
Stream<Integer> evenNumbers = numbers.parallelStream();
Directly from Values & ranges
Stream.of("Using", "Stream", "API", "From", "Java8");
can be converted into a parallel Stream
Stream.of("Using", "Stream", "API", "From", "Java8")
    .parallel();
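For ranges, a minimal sketch using `IntStream.range` (end exclusive) and `IntStream.rangeClosed` (end inclusive) from `java.util.stream`. The class `Ranges` and its method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the original slides.
class Ranges {

    // 0 + 1 + 2 + 3 + 4: the upper bound is excluded
    static int sumHalfOpen() {
        return IntStream.range(0, 5).sum();
    }

    // 1 + 2 + 3 + 4 + 5: the upper bound is included
    static int sumClosed() {
        return IntStream.rangeClosed(1, 5).sum();
    }

    // boxed() turns the IntStream into a Stream<Integer>
    static List<Integer> evensBelowTen() {
        return IntStream.range(0, 10)
                .filter(n -> n % 2 == 0)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumHalfOpen());    // 10
        System.out.println(sumClosed());      // 15
        System.out.println(evensBelowTen());  // [0, 2, 4, 6, 8]
    }
}
```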
Generators - Functions
Stream<Integer> integers =
    Stream.iterate(0, number -> number + 2);
This is an infinite Stream!
It will never be exhausted!
// Stream<int[]> instead of a raw Stream, so that t[0] compiles in map()
Stream<int[]> fibonacci =
    Stream.iterate(new int[]{0, 1},
        t -> new int[]{t[1], t[0] + t[1]});
fibonacci.limit(10)
    .map(t -> t[0])
    .forEach(System.out::println);
From Resources (Files)
// Files.lines throws IOException and the returned Stream should be closed after use
Stream<String> fileContent =
    Files.lines(Paths.get("readme.txt"));
Count all distinct words in a file
Files.lines(Paths.get("readme.txt"))
    .flatMap(line -> Arrays.stream(line.split(" ")))
    .distinct()
    .count();
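Since a Stream backed by a file holds an open resource, one possible way to write the word count is with try-with-resources. This is a sketch under assumptions: the class `DistinctWords`, the `demo()` helper and the temporary file contents are made up for the example, and words are split on any whitespace rather than on a single space.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original slides.
class DistinctWords {

    // Counts the distinct words of a file; the stream is closed on exit.
    static long count(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines
                    .flatMap(line -> Arrays.stream(line.split("\\s+")))
                    .filter(word -> !word.isEmpty())
                    .distinct()
                    .count();
        }
    }

    // Writes a small temporary file, counts its words, then deletes it.
    static long demo() {
        try {
            Path tmp = Files.createTempFile("readme", ".txt");
            try {
                Files.write(tmp, Arrays.asList("to be or not", "to be"));
                return count(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // distinct words: to, be, or, not
    }
}
```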
Parallel Streams
use stream()
List<Integer> numbers = new ArrayList<>();
for (int i = 0; i < 10_000_000; i++) {
    numbers.add((int) Math.round(Math.random() * 100));
}
// This will use just a single thread
Stream<Integer> evenNumbers = numbers.stream();
or parallelStream()
// Runs on the common ForkJoinPool, sized by default from the number of available processors
Stream<Integer> evenNumbers = numbers.parallelStream();
Let’s test it
use stream()
for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.stream()
        .filter(n -> n % 2 == 0)
        .sorted()
        .collect(toList());
    System.out.printf(
        "%d elements computed in %5d msecs with %d threads\n",
        even.size(), System.currentTimeMillis() - start,
        Thread.activeCount());
}
5001983 elements computed in 828 msecs with 2 threads
5001983 elements computed in 843 msecs with 2 threads
5001983 elements computed in 675 msecs with 2 threads
5001983 elements computed in 795 msecs with 2 threads
Let’s test it
use parallelStream()
for (int i = 0; i < 100; i++) {
    long start = System.currentTimeMillis();
    List<Integer> even = numbers.parallelStream()
        .filter(n -> n % 2 == 0)
        .sorted()
        .collect(toList());
    System.out.printf(
        "%d elements computed in %5d msecs with %d threads\n",
        even.size(), System.currentTimeMillis() - start,
        Thread.activeCount());
}
4999299 elements computed in 225 msecs with 9 threads
4999299 elements computed in 230 msecs with 9 threads
4999299 elements computed in 250 msecs with 9 threads
Enough, for now,
But this is just the beginning
Thank You.
@dgomezg
dgomezg@gmail.com
www.adictosaltrabajlo.com