Profiling Spark Applications: The Easy Way

by Michael (noreply@blogger.com) at June 25, 2017 07:20 AM


Recently, I was looking for a one-click way to profile Spark applications, so it could be easily integrated into any work environment without having to configure the system first.


The modern way to report profiling statistics about an application (any application, not just a Spark or Java one) is to generate a single .SVG file called a "flame graph". Since this is a regular vector graphics format, the file can be opened in any browser. Moreover, you can navigate between different stack frames by clicking on them, and even search for a symbol name using the "Search" link.


This is what a sample flame graph looks like: the y-axis shows the stack depth, while the x-axis shows the time spent in each stack frame.

Generating a flame graph usually involves two steps:
  • Capture stack traces from a running process, and dump them to disk. 
  • Parse these stack traces, and generate the .SVG file. 
For Java-based applications, stack traces can be gathered using commercial features of the Oracle JDK (the -XX:+FlightRecorder option). There's an article that explains how to profile Spark applications using this option.

In OpenJDK this feature is not available, but luckily there are other options. One of them is the statsd-jvm-profiler library from Etsy. This library attaches to the JVM as an agent, gathers statistics like CPU or memory usage, and sends them to a statsd server in real time. Conveniently, the library supports reporting to InfluxDB as well.

Keeping the above in mind, the whole process will look like this:
  1. Start InfluxDB on a random unused port.
  2. Start Spark with the statsd profiler Jar in its classpath and with the configuration that tells it to report statistics back to the InfluxDB instance.
  3. After the Spark application finishes, query all the reported metrics from the InfluxDB instance.
  4. Run a script that generates the target report.
  5. Stop the InfluxDB instance.
  6. Store generated .SVG file somewhere, or send it to someone.
The following script is a wrapper around the ‘spark-submit’ command that does all of that:
#!/bin/bash

set -e
trap 'kill $(jobs -p) 2>/dev/null' EXIT

function find_unused_port() {
    for port in $(seq $1 65000); do
        echo -ne "\035" | telnet 127.0.0.1 $port >/dev/null 2>&1;
        if [ $? -eq 1 ]; then
            echo $port
            exit
        fi
    done
    echo "ERROR: Can't find unused port in range $1-65000"
    exit 1
}

function install_deps() {
    for cmd in python2.7 perl pip; do
        if ! which $cmd >/dev/null 2>&1; then
            echo "ERROR: $cmd is not installed!"
            exit 1
        fi
    done

    echo -e "[$(date +%FT%T)] Installing dependencies"
    [ ! -d $install_dir ] && mkdir $install_dir
    pushd $install_dir >/dev/null
    pip -q install --user influxdb blist

    wget -qc https://github.com/etsy/statsd-jvm-profiler/releases/download/2.1.0/statsd-jvm-profiler-2.1.0-jar-with-dependencies.jar
    ln -sf statsd-jvm-profiler-2.1.0-jar-with-dependencies.jar statsd-jvm-profiler.jar

    wget -qc https://raw.githubusercontent.com/aviemzur/statsd-jvm-profiler/master/visualization/influxdb_dump.py
    wget -qc https://raw.githubusercontent.com/brendangregg/FlameGraph/master/flamegraph.pl

    wget -qc https://dl.influxdata.com/influxdb/releases/influxdb-1.2.4_linux_amd64.tar.gz
    tar -xzf influxdb-1.2.4_linux_amd64.tar.gz
    ln -sf influxdb-1.2.4-1 influxdb
    popd >/dev/null
}

function run_influxdb() {
    echo -e "[$(date +%FT%T)] Starting InfluxDB"
    cat << EOF >influxdb.conf
reporting-disabled = true
hostname = "${local_ip}"
bind-address = ":${influx_meta_port}"
[meta]
dir = "$(pwd)/influxdb/meta"
[data]
dir = "$(pwd)/influxdb/data"
wal-dir = "$(pwd)/influxdb/wal"
[admin]
enabled = false
[http]
bind-address = ":${influx_http_port}"
EOF
    rm -rf influxdb
    $install_dir/influxdb/usr/bin/influxd -config influxdb.conf >influxdb.log 2>&1 &

    wait_secs=5
    while [ $wait_secs -gt 0 ]; do
        if curl -sS -i $influx_uri/ping 2>/dev/null | grep X-Influxdb-Version >/dev/null; then
            break
        fi
        sleep 1
        wait_secs=$(($wait_secs-1))
    done

    if [ $wait_secs -eq 0 ]; then
        echo "ERROR: Couldn't start InfluxDB!"
        exit 1
    fi

    curl -sS -X POST $influx_uri/query --data-urlencode "q=CREATE DATABASE profiler" >/dev/null
    curl -sS -X POST $influx_uri/query --data-urlencode "q=CREATE USER profiler WITH PASSWORD 'profiler' WITH ALL PRIVILEGES" >/dev/null
}

function run_spark_submit() {
    spark_args=()
    jars=$install_dir/statsd-jvm-profiler.jar
    while [[ $# -gt 0 ]]; do
        case "$1" in
            --jars) jars="$jars,$2"
                shift
                ;;
            *) spark_args+=("$1")
                [[ "$1" == *.jar ]] && flamegraph_title="$1"
                ;;
        esac
        shift
    done

    spark_cmd=(spark-submit)
    spark_cmd+=(--jars)
    spark_cmd+=("$jars")
    spark_cmd+=(--conf)
    spark_cmd+=("spark.executor.extraJavaOptions=-javaagent:statsd-jvm-profiler.jar=server=${local_ip},port=${influx_http_port},reporter=InfluxDBReporter,database=profiler,username=profiler,password=profiler,prefix=sparkapp,tagMapping=spark")
    spark_cmd+=("${spark_args[@]}")

    echo -e "[$(date +%FT%T)] Executing: ${spark_cmd[@]}"
    "${spark_cmd[@]}"
}

function generate_flamegraph() {
    rm -rf stack_traces
    python2.7 $install_dir/influxdb_dump.py -o $local_ip -r $influx_http_port -u profiler -p profiler -d profiler -t spark -e sparkapp -x stack_traces
    perl $install_dir/flamegraph.pl --title "$flamegraph_title" stack_traces/all_*.txt > flamegraph.svg
    rm -rf stack_traces
    echo -e "[$(date +%FT%T)] Created flamegraph: $(pwd)/flamegraph.svg"
}

local_ip=$(ip route get 8.8.8.8 | awk '{print $NF; exit}')
install_dir=$HOME/.spark-flamegraph
influx_meta_port=$(find_unused_port 48080)
influx_http_port=$(find_unused_port $(($influx_meta_port+1)))
influx_uri=http://${local_ip}:${influx_http_port}
flamegraph_title="Spark Application"

install_deps
run_influxdb
run_spark_submit "$@"
generate_flamegraph

The script is also available on this gist.


To be fair, it should be noted that the following utilities must be present on your system prior to running the script: perl, python2.7 and pip. Other than that, the script has been used in an Amazon EMR environment without any issues. Just use it instead of the usual spark-submit command, and it will profile your application and create a report:

[hadoop@ip-10-121-4-244 tmp]$ ./spark-submit-flamegraph --name 'etlite' --jars file://$(pwd)/probe-events-1.0.jar etlite_2.11-0.1.0.jar s3://mobility-artifacts/airflow/latest/config/etlite.conf
[2017-06-05T12:34:05] Installing dependencies
[2017-06-05T12:34:09] Starting InfluxDB
[2017-06-05T12:34:10] Executing: spark-submit --jars /home/hadoop/.spark-flamegraph/statsd-jvm-profiler.jar,file:///tmp/probe-events-1.0.jar --conf spark.executor.extraJavaOptions=-javaagent:statsd-jvm-profiler.jar=server=10.121.4.244,port=48081,reporter=InfluxDBReporter,database=profiler,username=profiler,password=profiler,prefix=sparkapp,tagMapping=spark --name etlite etlite_2.11-0.1.0.jar s3://mobility-artifacts/airflow/latest/config/etlite.conf
17/06/05 12:34:11 INFO Main$: Configuration file = 's3://mobility-artifacts/airflow/latest/config/etlite.conf'
17/06/05 12:34:14 INFO S3NativeFileSystem: Opening 's3://mobility-artifacts/airflow/latest/config/etlite.conf' for reading
17/06/05 12:34:15 INFO SparkContext: Running Spark version 2.1.0

... running Spark application ...

17/06/05 12:35:17 INFO SparkContext: Successfully stopped SparkContext
17/06/05 12:35:17 INFO ShutdownHookManager: Shutdown hook called
17/06/05 12:35:17 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-fa12133c-b605-4a73-814a-2dfd4ed6fdde

... generating .svg file ...

[2017-06-05T12:35:25] Created flamegraph: /tmp/flamegraph.svg


Integrating this script into an Airflow Spark operator is straightforward, especially if your Spark operator is derived from BashOperator. Just make sure the script is available on all Spark Airflow workers, then replace the spark-submit command with this wrapper whenever profile=True is passed as the operator argument.

Post your weird flame graphs in comments! :)

by Michael (noreply@blogger.com) at June 25, 2017 07:20 AM

Asciidoctor: watch your build log

June 24, 2017 10:00 PM

Last Thursday at Voxxed Days Luxembourg I had the opportunity to speak about Asciidoctor in my talk (Documentation as code: contrôler la qualité !). Voxxed Days conferences are similar to Devoxx, but smaller (only one day). It was the second edition of Voxxed Days Luxembourg and the conference was really great (perfect organization, pleasant ambience, nice people and interesting conversations).

Voxxed Days Luxembourg

Back to my talk, I want to provide more details about a point I presented: how to monitor your Asciidoctor build logs with Jenkins.

During the build, asciidoctor tells you when something is unexpected. Let me give you a real example: In the eclipse scout documentation, you have a lot of code snippets with callouts to mark certain lines.

Figure 1. Eclipse scout documentation extract

In order to do this, you need to define the callout in your source code and add the explanation after the code snippet as presented in Listing 1.

Listing 1. Callout example
[source,adoc]
.Initial implementation of class OrganizationTablePage.
----
(..)
    return TEXTS.get("Organizations"); // <1>
(..)
----
<1> Make sure to add a translated text entry for "Organizations" using the Scout NLS tooling

If there is a mismatch between the two elements, you will get a warning in your logs:

[INFO]
[INFO] --- asciidoctor-maven-plugin:1.5.5:process-asciidoc (book_scout_intro-to-html) @ scout_beginners_guide ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] ignoreDelta true
[INFO] Copying 0 resource
asciidoctor: WARNING: _TutorialStep2.adoc: line 247: no callouts refer to list item 1 beginners_guide/src/docs/beginners-guide.adoc
[INFO]

If you are using Jenkins as your continuous integration server, the warnings plugin helps you to find those lines in your log. It also keeps a record of them, in order to track the evolution over time.

Jenkins Job

Here is how you can configure the plugin in the admin view in order to detect the asciidoctor lines:

Jenkins Admin configuration
Listing 2. Regular expression:
asciidoctor: (WARNING|ERROR): ([^:]+): line ([0-9]+): (.*)
Listing 3. Mapping Script:
import hudson.plugins.warnings.parser.Warning

String category = matcher.group(1)
String fileName = matcher.group(2)
String lineNumber = matcher.group(3)
String message = matcher.group(4)

return new Warning(fileName, Integer.parseInt(lineNumber), "Dynamic Parser", category, message);
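
If you want to check the regular expression and its group numbering outside of Jenkins first, a small standalone test using the plain JDK regex classes can do it (this is just an illustration with a hypothetical class name, applied to the warning line from the log above):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AsciidoctorWarningRegexCheck {
    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("asciidoctor: (WARNING|ERROR): ([^:]+): line ([0-9]+): (.*)");
        String logLine = "asciidoctor: WARNING: _TutorialStep2.adoc: line 247: "
                + "no callouts refer to list item 1 beginners_guide/src/docs/beginners-guide.adoc";
        Matcher matcher = pattern.matcher(logLine);
        if (matcher.matches()) {
            System.out.println("category = " + matcher.group(1));   // WARNING
            System.out.println("file     = " + matcher.group(2));   // _TutorialStep2.adoc
            System.out.println("line     = " + matcher.group(3));   // 247
            System.out.println("message  = " + matcher.group(4));
        }
    }
}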

Then in your build definition you need to add a post-build step:

Jenkins Job configuration

Now you are informed when something goes wrong in your documentation. By the way, there is an idea that Asciidoctor could produce a report containing all the warnings and errors discovered during the build; they could be collected in an XML or JSON file. For the moment, issue #44 is still open.

By the way, my slides are online on SlideShare.

PS: I have already proposed a talk for EclipseCon Europe 2017. I hope I will get a slot to be able to present more aspects of the "documentation as code" pattern.


June 24, 2017 10:00 PM

PolarSys Capella Industry Consortium (IC) at Eclipse PolarSys

June 20, 2017 08:00 AM

PolarSys Capella IC to host the Capella Ecosystem stakeholders in a vendor neutral way, organized by an open governance model.

June 20, 2017 08:00 AM

Woohoo! Java 9 has a REPL! Getting Started with JShell and Eclipse January

by Jonah Graham at June 19, 2017 08:16 PM

With Java 9 just around the corner, we explore one of its most exciting new features – the Java 9 REPL (Read-Eval-Print Loop). This REPL is called JShell and it’s a great addition to the Java platform. Here’s why.

With JShell you can easily try out new features and quickly check the behaviour of a section of code. You don’t have to create a long-winded dummy main or JUnit test – simply type away.  To demonstrate the versatility of JShell, I am going to use it in conjunction with the Eclipse January package for data structures. Eclipse January is a set of libraries for handling numerical data in Java, think of it as a ‘numpy for Java’.

Install JShell

JShell is part of Java 9, currently available in an Early Access version from Oracle and other sources. Download and install Java 9 JDK from http://jdk.java.net/9/ or, if you have it available on your platform, you can install with your package manager (e.g. sudo apt-get install openjdk-9-jdk).

Start a terminal and run JShell:
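
In case the screenshot does not come through, a session looks roughly like this (illustrative only – the exact banner and snippet numbering depend on your build):

jshell> int x = 40
x ==> 40

jshell> x + 2
$2 ==> 42

jshell> $2 * 10
$3 ==> 420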

As you can see, JShell allows you to type normal Java statements, leave off semi-colons, run expressions, access expressions from previous outputs, and achieve many other short-cuts. (You can exit JShell with Ctrl-D.)

Using JShell with Eclipse January

To use Eclipse January, you need to:

1. Download January:

Get the January 2.0.2 jar (or the older January 2.0.1 jar).

2. Download the dependency jars:

The January dependencies are available from Eclipse Orbit; they are the Apache Commons Lang, Apache Commons Math and SLF4J (API plus NOP binding) bundles that appear in the classpath below.

3. Run JShell again, but add all the jars you downloaded to the classpath (remember to be in the directory you downloaded the jars to):

Windows:

"c:\Program Files\Java\jdk-9\bin\jshell.exe"  --class-path org.eclipse.january_2.0.2.v201706051401.jar;org.apache.commons.lang_2.6.0.v201404270220.jar;org.apache.commons.math3_3.5.0.v20160301-1110.jar;org.slf4j.api_1.7.10.v20170428-1633.jar;org.slf4j.binding.nop_1.7.10.v20160301-1109.jar

Linux:

jshell --class-path org.eclipse.january_2.0.2.v201706051401.jar:org.apache.commons.lang_2.6.0.v201404270220.jar:org.apache.commons.math3_3.5.0.v20160301-1110.jar:org.slf4j.api_1.7.10.v20170428-1633.jar:org.slf4j.binding.nop_1.7.10.v20160301-1109.jar

Some notes:
  • In some versions of jshell the command-line argument is called -classpath instead of --class-path.
  • If you are using git bash as your shell on Windows, add winpty before calling jshell and use colons to separate the path elements.


Then you can run through the different types of January commands. Note JShell supports completions using the ‘Tab’ key. Also use /! to rerun the last command.

Import classes

Start by importing the needed classes:

import org.eclipse.january.dataset.*

(No need for semi-colons and you can use the normally ill-advised * import)

Array Creation

Eclipse January supports straightforward creation of arrays. Let’s say we want to create a 2-dimensional array with the following data:

[1.0, 2.0, 3.0,
 4.0, 5.0, 6.0,
 7.0, 8.0, 9.0]

First we can create a new dataset:

Dataset dataset = DatasetFactory.createFromObject(new double[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 })
System.out.println(dataset.toString(true))

This gives us a 1-dimensional array with 9 elements, as shown below:

[1.0000000, 2.0000000, 3.0000000, 4.0000000, 5.0000000, 6.0000000, 7.0000000, 8.0000000, 9.0000000]

We then need to reshape it to be a 3×3 array:

dataset = dataset.reshape(3, 3)
System.out.println(dataset.toString(true))

The reshaped dataset:

 [[1.0000000, 2.0000000, 3.0000000],
 [4.0000000, 5.0000000, 6.0000000],
 [7.0000000, 8.0000000, 9.0000000]]

Or we can do it all in just one step:

Dataset another = DatasetFactory.createFromObject(new double[] { 1, 1, 2, 3, 5, 8, 13, 21, 34 }).reshape(3, 3)
System.out.println(another.toString(true))

Another dataset:

 [[1.0000000, 1.0000000, 2.0000000],
 [3.0000000, 5.0000000, 8.0000000],
 [13.000000, 21.000000, 34.000000]]

There are methods for obtaining the shape and number of dimensions of datasets:

dataset.getShape()
dataset.getRank()

Which gives us:

jshell> dataset.getShape()
$8 ==> int[2] { 3, 3 }

jshell> dataset.getRank()
$9 ==> 2

Datasets also provide functionality for ranges and a random function, both of which allow easy creation of arrays:

Dataset dataset = DatasetFactory.createRange(15, Dataset.INT32).reshape(3, 5)
System.out.println(dataset.toString(true))

[[0, 1, 2, 3, 4],
 [5, 6, 7, 8, 9],
 [10, 11, 12, 13, 14]]


import org.eclipse.january.dataset.Random //specify Random class (see this is why star imports are normally bad)
Dataset another = Random.rand(new int[]{3,5})
System.out.println(another.toString(true))

[[0.27243843, 0.69695728, 0.20951172, 0.13238926, 0.82180144],
 [0.56326222, 0.94307839, 0.43225034, 0.69251040, 0.22602319],
 [0.79244049, 0.15865358, 0.64611131, 0.71647195, 0.043613393]]

Array Operations

The org.eclipse.january.dataset.Maths class provides rich functionality for operating on Dataset instances. For instance, here’s how you could add two Dataset arrays:

Dataset add = Maths.add(dataset, another)
System.out.println(add.toString(true))

Or you could do it as an in-place addition. The example below creates a new 3×3 array and then adds 100 to each element of the array.

Dataset inplace = DatasetFactory.createFromObject(new double[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 }).reshape(3, 3)
inplace.iadd(100)
System.out.println(inplace.toString(true))

[[101.0000000, 102.0000000, 103.0000000],
 [104.0000000, 105.0000000, 106.0000000],
 [107.0000000, 108.0000000, 109.0000000]]

Slicing

Datasets simplify extracting portions of the data, known as ‘slices’. For instance, given the array below, let’s say we want to extract the data 2, 3, 5 and 6.

[1, 2, 3,
 4, 5, 6,
 7, 8, 9]

This data resides in the first and second rows and the second and third columns. For slicing, indices for rows and columns are zero-based. A basic slice consists of a start and stop index, where the start index is inclusive and the stop index is exclusive. An optional increment may also be specified. So this example would be expressed as:

Dataset dataset = DatasetFactory.createFromObject(new double[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 }).reshape(3, 3)
System.out.println(dataset.toString(true))
Dataset slice = dataset.getSlice(new Slice(0, 2), new Slice(1, 3))
System.out.println(slice.toString(true))

slice of dataset:

[[2.0000000, 3.0000000],
 [5.0000000, 6.0000000]]
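
A step can also be passed as an optional third argument (a quick sketch, assuming a Slice(start, stop, step) constructor). For example, taking every other column of the original 3×3 dataset:

Dataset everyOtherColumn = dataset.getSlice(new Slice(0, 3), new Slice(0, 3, 2))
System.out.println(everyOtherColumn.toString(true))

which should print something like:

[[1.0000000, 3.0000000],
 [4.0000000, 6.0000000],
 [7.0000000, 9.0000000]]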

Slicing and array manipulation functionality is particularly valuable when dealing with 3-dimensional or n-dimensional data.

Wrap-Up

For more on Eclipse January see the following examples and give them a go in JShell:

  • NumPy Examples shows how common NumPy constructs map to Eclipse Datasets.
  • Slicing Examples demonstrates slicing, including how to slice a small amount of data out of a dataset too large to fit in memory all at once.
  • Error Examples demonstrates applying an error to datasets.
  • Iteration Examples demonstrates a few ways to iterate through your datasets.
  • Lazy Examples demonstrates how to use datasets which are not entirely loaded in memory.

Eclipse January is a ‘numpy for Java’, but until now users have not really been able to play around with it in the same way you would with numpy in Python.

JShell provides a great way to test drive libraries like Eclipse January. There are a couple of features that would be nice to have, such as a magic variable for the last result (maybe $_ or $!) and maybe a shorter way to print a result (maybe /p :-). But overall, it is great to have, and it finally gives Java the REPL and the interactive usage that many have come to rely on in other programming languages.

In fact we will be making good use of JShell for the Eclipse January workshop being held at EclipseCon France, see details and register here:  https://www.eclipsecon.org/france2017/session/eclipse-january




by Jonah Graham at June 19, 2017 08:16 PM

Getting Started with Jekyll (on Windows!)

by maggierobb at June 16, 2017 11:31 AM

 This week Kichwa Coders’ intern Jean Philippe found out the hard way that when it comes to building websites, having the right tools for the job is vital to success. Follow his progress as he explores the potential of using Jekyll to build a user-friendly, easy to maintain static website on Windows.

What is Jeykll and why do we use it?

Jekyll uses Markdown – a text-to-HTML conversion tool – to create a blog-aware static website that doesn’t require a huge amount of maintenance. Once you have created the structure you just have to add your own Markdown file and Jekyll will add it to the website. The appeal of Jekyll for many users is that it allows content editors to edit the site without knowing how to code. After some rudimentary experience I can now create a basic Jekyll website.

How easy is it to get started with Jekyll?

This week I built my first website using Jekyll. I had some initial difficulties understanding how to use it, but once I’d got the basics I was able to come up with ideas on how to get the best out of it. Before you can install Jekyll you need to install Ruby and Bundler. I’m on Windows, so at first it was hard to install Jekyll, as it is more suited to Linux: Linux users are mostly familiar with the command line, and it’s easier to install Ruby and Bundler there, but I found this website. However, when I attempted to build a new project with the command “jekyll new newproject” I got this:

This wasn’t what I was expecting and I was just wondering what to do next when a colleague pointed out that this was a problem with the version of Jekyll that I had on my laptop – and that the project showing on my screen was in fact correct – phew! I went back to some useful tutorials and discovered that I needed to create some specific folders and add the initial group of folders to them. I called these new folders  ‘_includes’ and  ‘_layouts’.  I then had to create some files to build the structure of the website.

Having got my basic website up and running I wanted to build a new project within it, but again I ran into difficulties almost immediately. I assumed I had made a mistake earlier on in the process, as I did not appear to have the correct architecture to build my project with. But then it became clear that there are two ways to install Jekyll. The first is with the instruction “sudo apt-get install jekyll”; the second, using Ruby, is “gem install jekyll”. The difference between these two instructions is crucial, as the first installs an older version of Jekyll and the second installs the latest version. Most of the help and tuition online is for the older version of Jekyll, and this was what was confusing me as I was working on a newer version. Once I had installed the software on a Linux machine with the command “sudo apt-get install jekyll”, I was able to proceed.

My first attempt at a Jekyll website is now online.

 

It’s still a work in progress but I’m really pleased with the results so far. Using online tutorials and help forums etc is a great way to learn your way around something new- but don’t forget to check your sources are relevant to the software you’re using.

What’s Next?

I’m hoping to create a website with different topics. The home page will reproduce the style of the existing January website, but in addition I will include a “Docs” page where you can read all the posts so far about the January project. At the moment that’s mainly the work completed by my colleagues at Kichwa Coders, but in the future it could be anyone with something to say about January. I hope it will grow to be a useful resource.

The best aspect of Jekyll is that when I finish the website, I can write a post in Markdown and add it to the site easily. I recommend Jekyll if you want a quick and easy way to add posts and don’t want to write any HTML: you just have to build the structure once, and then to add a post you simply drop a Markdown file in.

 

 



by maggierobb at June 16, 2017 11:31 AM

I’m being supported!

by tevirselrahc at June 09, 2017 11:15 AM

There was already an effort under way, named Team Cyperus, to provide commercial support for me, as it was something that industries told my minions was really needed.

Well, it did not take too long for someone else to see the benefit in supporting me!

Go see their press release about this acquisition!

 


Filed under: community, Uncategorized Tagged: commercial, services, support

by tevirselrahc at June 09, 2017 11:15 AM

Already sold out? – Eclipse Democamp Oxygen 2017, June 28th 2017

by Maximilian Koegel and Jonas Helming at June 09, 2017 07:13 AM

Every year it is amazing to see how fast the Eclipse DemoCamp in Munich sells out. To give everyone a fair chance to register, we announced the time the registration opens in advance. This year, it took less than a week until the available 120 seats were fully booked. However, even if you did not get a seat, we still encourage you to register for the waiting list. You can register here. There you’ll also find detailed information on the location, agenda, time and more. The DemoCamp is in 3 weeks (June 28th 2017) and some of the 120 registrants will be forced to cancel. We remind all current registrants to double-check their availability; we expect some people to unregister before the event. If and only if you are on the waiting list, you will be notified immediately once a seat becomes available. We wish you good luck and we are looking forward to great demos and seeing you in June!

A big thanks to our sponsors: BSI Business Systems Integration AG, EclipseSource München GmbH and Eclipse Foundation.


by Maximilian Koegel and Jonas Helming at June 09, 2017 07:13 AM

Preview of a guide for Java developers

by jponge at June 09, 2017 12:00 AM

I could not attend the last Eclipse Vert.x community face-to-face meeting last fall, but one item that was discussed is the need for guides aimed at certain types of developers. One of my missions as part of joining the team was to work on this and I’m very happy to share it with you today!

A gentle guide to asynchronous programming with Eclipse Vert.x for enterprise application developers

The guide is called “A gentle guide to asynchronous programming with Eclipse Vert.x for enterprise application developers” and it is an introduction to asynchronous programming with Vert.x, primarily aimed at developers familiar with mainstream non-asynchronous web development frameworks and libraries (e.g., Java EE, Spring).

Quoting the introduction:

We will start from a wiki web application backed by a relational database and server-side rendering of pages; then we will evolve the application through several steps until it becomes a modern single-page application with “real-time” web features. Along the way you will learn to:

  1. Design a web application with server-side rendering of pages through templates, and using a relational database for persisting data.
  2. Cleanly isolate each technical component as a reusable event processing unit called a verticle.
  3. Extract Vert.x services for facilitating the design of verticles that communicate with each other seamlessly both within the same JVM process or among distributed nodes in a cluster.
  4. Testing code with asynchronous operations.
  5. Integrating with third-party services exposing a HTTP/JSON web API.
  6. Exposing a HTTP/JSON web API.
  7. Securing and controlling access using HTTPS, user authentication for web browser sessions and JWT tokens for third-party client applications.
  8. Refactoring some code to use reactive programming with the popular RxJava library and its Vert.x integration.
  9. Client-side programming of a single-page application with AngularJS.
  10. Real-time web programming using the unified Vert.x event bus integration over SockJS.

The guide takes a gradual approach by starting with a “quick and dirty” solution, then refactoring it properly, exposing the core Vert.x concepts, adding features, and moving from callbacks to RxJava.

We need your feedback!

The code is available at https://github.com/vert-x3/vertx-guide-for-java-devs. You can report feedback as Github issues to that repository and even offer pull-requests.

You can check it out from GitHub (the AsciiDoc is being rendered fine from the repository interface) or you can check out pre-rendered HTML and PDF versions that I am temporarily sharing and keeping up-to-date from my Dropbox: https://www.dropbox.com/sh/ni9znfkzlkl3q12/AABn-OCi1CZfgbTzOU0jYQpJa?dl=0

Many thanks to Thomas Segismont and Julien Viet who contributed some parts, and also to the people who reviewed it privately.

As usual, we welcome your feedback!


by jponge at June 09, 2017 12:00 AM

MDSD (and ME!) in Robotics

by tevirselrahc at June 08, 2017 08:25 PM

One of my minions (from Queen’s University) pointed out an interesting document titled “Robotics 2020 Multi-Annual Roadmap” from SPARC – the Partnership for Robotics in Europe.

It is a very interesting read, especially for those, like me, who are involved in the Eclipse PolarSys Rover project and the Papyrus Industry Consortium!

I especially like the following blurbs:

From printed page 248 (go to page 258 in the PDF):

“Model based methods are needed at the core of all complex robot systems and through
the lifecycle. To address increasing complexity, a shift from human-oriented document-driven
approaches to computer-assisted tools and a computer processable model-driven approach is
needed in order to gain from design support processes”

And from printed page 251 (go to page 261 in the PDF):

“Model-driven software development and domain specific languages are core technologies
required in order to achieve a separation of roles in the robotics domain while also improving
composability, system integration, and also addressing non-functional properties”

Many of these aspects are already part of work that has been done on me.

This is exactly where I can and will make a difference!

Shoutout to my minion GD for making me aware of this!


Filed under: DSML, Papyrus, Papyrus IC, Papyrus-RT, PolarSysRover, RobotML, Uncategorized

by tevirselrahc at June 08, 2017 08:25 PM

LiClipse 4.0 released

by Fabio Zadrozny (noreply@blogger.com) at June 08, 2017 06:35 PM

LiClipse 4.0 is now available for download!

This is the first release based on Eclipse Oxygen (built on 4.7 RC3). It's still not a final release, but very close to it, so, make sure you take a look at https://www.eclipse.org/eclipse/news/4.7/platform.php for the latest news in the platform (the gem for me is now being able to hide the status bar -- personally, I'm now using Eclipse without the toolbar nor status bar -- really nice for a minimalist theme).

There was a critical fix for users on Mac OS which resulted in LiClipse not working properly after an update. Note that for Mac users which are using LiClipse 3.x, a fresh install is needed (follow the instructions from http://www.liclipse.com/download.html#updating_native).

The other major changes in this release are actually in PyDev, which features a fix for an issue that prevented code coverage from working properly, and which now supports code coverage when testing with pytest. Also, IronPython is supported again in the debugger (it was previously broken because IronPython didn't support sys._current_frames). An important note is that IronPython 2.7.6 and 2.7.7 don't work with PyDev because of a critical issue in IronPython, so either keep to IronPython 2.7.5 or use the development version.

There were also fixes in the PyLint integration, updating docstrings, finding __init__ on code-completion when it's resolved to a superclass, etc... See: http://www.pydev.org for more details.

Enjoy!

by Fabio Zadrozny (noreply@blogger.com) at June 08, 2017 06:35 PM

Last Chance to Register for EclipseCon France 2017

June 08, 2017 02:45 PM

Sign up for learning at EclipseCon France, June 21-22. XText, Capella, Science, Docker, LSP and much more!

June 08, 2017 02:45 PM

sprotty – A Web-based Diagramming Framework

by Jan Köhnlein at June 08, 2017 01:54 PM

Development tools in the web are trending. With Theia, we have already started to build an IDE platform with web technologies that works for browser apps as well as rich clients. While Xtext, Monaco, and LSP constitute a good foundation for textual editing, the question arises whether we can extend this idea to graphics. So we started brooding over a graphical framework as well, and here is the result: let me introduce you to sprotty.

sprotty is a web-based framework for diagrams. It is open-source under the Apache 2 license. The source code is on Github.

Rendering and Animations

sprotty uses SVG, and thereby offers modern, stable and scalable rendering on a wide range of browsers.

In sprotty, animations have been built-in right from the beginning, so the framework is prepared for asynchronous state changes everywhere. Animations help the user of a sprotty diagram to keep the context without being distracted by flickering. We already ship a set of pre-built transitions for morphing diagrams on state changes, but you can easily build your own ones. You can even travel back and forth in time using animations.

sprotty also comes with a bridge to the Eclipse Layout Kernel for sophisticated automatic diagram layout.

Separation of Client and Server

A sprotty app usually consists of two major components: the client only holds a model of the current diagram, renders it, and interacts with the user. The optional server knows about an underlying semantic model and how to map it to diagram elements. Client and server communicate by sending each other JSON notifications. This minimizes the memory footprint on the client, which is necessary for a browser app. A server can handle much bigger amounts of data, e.g. from a database or a development workspace. Having said that, sprotty can of course be used as a client-only app without a backend.

Integration With LSP, Xtext and Theia

While sprotty is not necessarily tied to the LSP, its architecture is a good match. We have integrated it with Theia by extending an Xtext-based language server with the sprotty server, tunneling the messages through the LSP, and creating a Theia widget holding the sprotty client. The source code is on Github as well. The result can be seen here:

Reactive Architecture with Dependency Injection

sprotty’s architecture is inspired by modern reactive frameworks like React/Flux. Information flows in a unidirectional circle between three functional components. This architecture is much less susceptible to event feedback cycles than the traditional model-view-controller approach. As the components don’t rely on a shared state, they can be unit-tested individually without the need to set up the entire environment or to start a browser.

In the viewer component, sprotty uses a very fast virtual DOM framework that patches the actual DOM with the changes. Using so-called thunks, you can skip entire branches of the DOM if they are unchanged, to further optimize performance.

sprotty’s client is implemented in TypeScript, so you can enjoy the benefits of static typing if you wish or use JavaScript directly if not. The SVG views can be easily specified using JSX and styled with CSS.

All components of a diagram are wired up with dependency injection. This way, users can customize every single part, while getting good defaults without much ado.

The sprotty server is written in Xtend, which is transpiled to Java as TypeScript is to JavaScript. Integration with Xtext and a language server is easy.

Current State

We have just started sprotty, but it should already be applicable to many scenarios. We plan further extensions, e.g. to allow the user to assemble the content of a diagram and persist the result. We expect to get quite some momentum in the near future in combination with the Theia project.

Give it a try and let us know what you think. sprotty works best with the Chrome browser. Issue reports are welcome. Like in Theia, the CDT team at Ericsson has started contributing to sprotty, and so could you!


by Jan Köhnlein at June 08, 2017 01:54 PM

Untested Code is like Schrödinger’s Cat – Dead or Alive?

by maggierobb at June 08, 2017 08:50 AM


If every line of untested code is like Schrödinger’s cat – Potentially dead or alive – how important is it to ‘open the box’ properly and know for sure if the code will leap out and run?

The perceived wisdom that if a piece of code hasn’t been tested you can assume it won’t work is proof – if any were needed – that coders will always expect the worst-case scenario when creating code. Unlike Schrödinger, a coder will not waste time mulling over the metaphysical possibilities of whether their code might be dead or alive, or even dead AND alive at the same time – they need certainty, and as quickly as possible. However, any amount of testing will only be worthwhile if the quality of that testing is high. In this blog Yannick Mayeur, a Kichwa Coders intern, describes how he kept his fur on whilst improving the test coverage of Eclipse January.

An introduction to JUnit

This week I was reintroduced to JUnit, having forgotten most of what I had learned about it at the University Institute of Technology back home. JUnit is a unit testing framework. It is used to test the different methods of a program to see whether or not the intended behaviour is working. It is often said that a method that is not tested is a method full of bugs, and after a week of testing  I can confirm that this saying is indeed grounded in truth.

My job this week was to improve the test coverage of Eclipse January. You can calculate the coverage of a program using the EclEmma plug-in. I worked on the DatasetUtils class, improving the coverage from 47% to almost 58% and fixing bugs along the way, in two pull requests (https://github.com/eclipse/january/pull/178 and https://github.com/eclipse/january/pull/188).

Seeing that bugs can exist in untested code written by people that know a lot more about what they are doing than I do, really showed me the importance of testing.

How I did it

This is a test I have written for the method “crossings”. Writing this test helped me highlight some unexpected behaviour in the way it works.

@Test
public void testCrossings3() {
	Dataset yAxis = DatasetFactory.createFromObject(new Double[] {
			0.5, 1.1, 0.9, 1.5 });
	Dataset xAxis = DatasetFactory.createFromObject(new Double[] {
			1.0, 2.0, 3.0, 4.0 });
	List<Double> expected = new ArrayList<Double>();
	expected.add(2.5);
	List<Double> actual = DatasetUtils.crossings(xAxis, yAxis, 1,
			0.5);
	assertEquals(expected, actual);
}

This is what the values look like:

The expected behaviour of the method, as written in the test, would be that the three crossing points are merged into one at 2.5, but this wasn't what was happening: the code was using ">" instead of ">=". If left untested, this bug would probably never have been discovered.

Conclusion

Discovering bugs like this one is crucial. When users employ this method they are almost certainly expecting the same behaviour that I was, and therefore won’t understand why their code isn’t working – especially if they can’t see the original code of the method and only have access to its Javadoc. I hope that correcting bugs like this one will create a smoother user experience for coders in the future.

 

 

 

 



by maggierobb at June 08, 2017 08:50 AM

Improving Eclipse CDT Indexer Performance

by Patrick Könemann (patrick.koenemann@itemis.com) at May 31, 2017 01:20 PM

Many of our customers have chosen Eclipse CDT as their tooling for developing C/C++ projects and they frequently complain about the poor runtime of the CDT indexer. I would claim that we (Java developers) are quite spoiled about the excellent performance of the Java tooling in Eclipse.

Indexing a large Java workspace at startup may take several seconds, for really large projects maybe a few minutes, but afterwards it rarely blocks the sophisticated Java tooling. Be it large code imports or refactorings, the Eclipse Java tooling handles them very well. But when developing a large C project, indexing may take several minutes, maybe even half an hour or more, depending on the available memory and CPU of course. This was the trigger for us to put some effort into analyzing the hotspots of the CDT indexer, with the goal of improving performance.

To make a long story short: an additional cache in the internal string handling may speed up the indexing time of certain projects by nearly 40%!

Factors that Influence Indexer Performance

First of all, let’s investigate the different factors that influence the indexer performance:

  • Java heap size that is available to the indexer for a cache:
    If the internally used memory for building and managing the index is too low, parts of the index are stored on the hard drive, and we all know that writing data to and reading it from a hard drive is much more expensive than accessing it in main memory.
    The CDT indexer has a preference option to allow it to consume up to 50% of the available heap size for an internal cache.
  • Structure of the C/C++ source files:
    If the sources are organized in a flat hierarchy, i.e. with few includes, and do not contain thousands of macro definitions, the indexer is much faster than with sources that have a high degree of includes and plenty of macro definitions.
  • Hardware resources:
    The faster the CPU / hard drive / memory, the faster the indexer – obviously ;-)

We don’t have much influence on the latter two factors, but at least we can configure Eclipse and the CDT indexer to have enough memory available (e.g. 1 GB). But still, performance is far from acceptable (some customers even reported times of 45 minutes and longer).

So we took one of our customer’s projects, an AUTOSAR project, which had an indexing time of slightly more than 10 minutes on my machine, and used the YourKit Java Profiler to search for some hotspots:

  1. observation: during 1 minute, the indexer was creating more than 4 billion (!) objects of type ShortString and LongString.
  2. observation: most calculation time is spent for comparing large char arrays (e.g. length > 70,000) which are represented by long strings.

To tackle these two issues, we first need to understand the actual problem:

  • Why are so many objects created (and garbage collected just as fast)?
  • What are these long strings?
  • Why is it necessary to compare them?
  • How can we improve string handling and comparison?

Code Structure Matters!

When looking into these long strings that are frequently compared, it turns out that they consist of plenty of macro definitions concatenated in alphabetical order. It seems that they are calculated for particular locations in the source files for fast lookup. (To be honest, I did not (yet) fully understand how the indexer works in detail, e.g. why it needs to store so many long but very similar strings – so please correct me if the conclusions to my observations are not correct.) We found the reason for these incredibly long lists of macro definitions in the code structure, which is sketched below:

 

[Figure: macro definitions inside the code structure]

There is one large header file which contains more than 1,000 macro definitions (MemMap.h) and there are many other header files that include this file – even multiple times. New macro definitions and includes to MemMap.h do alternate, resulting in different sets of defined macros at each location. These macro definitions are stored by the indexer inside a data structure that is basically a super large char array. For adding and searching entries (e.g. the set of defined macros at a particular location), a binary tree data structure is used on top of that super large char array. Whenever a candidate string should be added or searched inside this binary tree, a visitor is used to compare existing entries with that candidate. The comparison must decide whether the candidate entry is smaller or larger than existing entries, which requires the strings to be compared character by character. For each entry to be compared, a new ShortString or LongString object is created which is a wrapper of the actual string with a java.lang.String-like API for comparison.

First Attempt: Reducing Object Creation

Our first guess was that the immense number of Short- and LongString objects may slow down the overall performance, so we reduced that as far as possible:

  • Whenever a new string is stored inside the database, most of the time its location in the data structure (an index of the super large char array) is of interest – so we directly returned that index instead of creating (and immediately disposing) a Short-/LongString object.
  • Whenever some particular string is requested via such an index (e.g. the represented char array or a java.lang.String instance), we directly retrieved that data instead of creating (and immediately disposing) a Short-/LongString object.
  • We only left the Short-/LongString objects when they are needed for comparisons.

Although far fewer objects are created, the indexer run time was not affected at all. It seems as if the Java compiler and the VM do quite some optimizations, so that the overhead of creating wrapper objects does not really affect performance.

Second Attempt: Optimizing Comparison of Char Arrays

Whenever two LongString instances are compared (i.e. determining their lexicographical order), ShortString.compareChars(..) is called. This looks odd. For understanding how and why this could be optimized, we must first understand the difference between a ShortString and a LongString.

The database (the super large char array) is divided into chunks of 4096 chars. Whenever a string should be stored inside the database, one or more chunks must be allocated. If the string to be stored is larger than the available space inside one chunk, it must be split over multiple chunks.

  • A ShortString is shorter than the size of a chunk, so it fits into a single chunk.
  • A LongString is larger than the size of a single chunk, so it must be split and distributed over multiple chunks.

A pointer to the beginning of the stored string is enough to represent the location and to retrieve it again when needed (the last entry in a chunk is a pointer to the next chunk).

The costly comparison in our AUTOSAR project compares strings with length > 70,000, so it compares strings that are stored in ⌈70,000 / 4,096⌉ = 18 chunks. The current implementation loads the entire char array of full length for both strings to be compared, and compares them character by character, even if they already differ in a character contained in one of the first chunks. Our idea is a new compare logic for long strings, LongString.compare(..), which loads and compares characters chunk by chunk to decide their lexicographical order. This implementation is listed in bug 515856.
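
The gist of the idea, as a self-contained sketch (this is an illustration only, not the actual patch from bug 515856 – the chunk storage is faked with a char[][] here), is to load and compare one chunk at a time and stop as soon as a difference is found:

class ChunkedStringCompareSketch {
    static final int CHUNK_SIZE = 4096;

    // stand-in for the database: split a string into chunks of CHUNK_SIZE chars
    static char[][] toChunks(String s) {
        int n = (s.length() + CHUNK_SIZE - 1) / CHUNK_SIZE;
        char[][] chunks = new char[n][];
        for (int i = 0; i < n; i++) {
            int end = Math.min(s.length(), (i + 1) * CHUNK_SIZE);
            chunks[i] = s.substring(i * CHUNK_SIZE, end).toCharArray();
        }
        return chunks;
    }

    // compare chunk by chunk instead of materializing both strings at full length first
    static int compare(char[][] a, int lenA, char[][] b, int lenB) {
        int common = Math.min(lenA, lenB);
        for (int offset = 0; offset < common; offset += CHUNK_SIZE) {
            char[] chunkA = a[offset / CHUNK_SIZE];   // one chunk per string
            char[] chunkB = b[offset / CHUNK_SIZE];
            int count = Math.min(CHUNK_SIZE, common - offset);
            for (int i = 0; i < count; i++) {
                if (chunkA[i] != chunkB[i]) {
                    return chunkA[i] < chunkB[i] ? -1 : 1;   // differ here, skip the remaining chunks
                }
            }
        }
        return Integer.compare(lenA, lenB);   // one string is a prefix of the other
    }

    public static void main(String[] args) {
        String x = new String(new char[70000]).replace('\0', 'a') + "X";
        String y = new String(new char[70000]).replace('\0', 'a') + "Y";
        System.out.println(compare(toChunks(x), x.length(), toChunks(y), y.length()));   // prints -1
    }
}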

With this optimization, indexer performance is in fact increased for our AUTOSAR project: instead of 680 seconds, the indexer finished in 640 seconds. This is an improvement of roughly 6% with no additional memory consumption.

Third Attempt: Caching Char Arrays

The previous two attempts already explained that Short-/LongString instances are always newly created for each access of strings stored inside the super large char array. Moreover, each access on the actual string (represented by a char array stored in one or more chunks), (re-)retrieves the desired char array from the chunks, e.g. for comparing it against another string. The indexer has already a cache built-in which holds recently accessed chunks in memory until the indexer cache size limit is hit. If a requested chunk is not inside the cache, it must be loaded from the super large char array. However, the actual char array for comparison is still reconstructed inside the Short-/LongString classes on each access.

Instead of caching chunks, we tried to cache the actual Short-/LongString objects. Moreover, instead of having a fixed cache size, we use Java references to let the garbage collector free some memory when needed. So whenever a new Short-/LongString is retrieved from the database and constructed from its chunks, it is stored by its long pointer in a map of references. Another access of the same string first checks whether the string is still contained in that map. If so, we can directly return it; if not, we need to re-construct the string from the chunks and re-add it into the map. It was a bit tricky to let the garbage collector free the memory for both, the string value and the key (the long pointer), but with a negligible overhead, it is possible. This implementation is listed in bug 514708. 
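
A stripped-down sketch of the idea (again, an illustration rather than the actual patch from bug 514708 – loadFromChunks() is a placeholder for re-reading the string from the chunks): cached char arrays are held via soft references keyed by their long pointer, and a ReferenceQueue lets us drop the map entries once the garbage collector has cleared the values.

import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

class StringCacheSketch {

    // soft reference that remembers its key so the map entry can be removed later
    private static final class Entry extends SoftReference<char[]> {
        final long pointer;
        Entry(long pointer, char[] value, ReferenceQueue<char[]> queue) {
            super(value, queue);
            this.pointer = pointer;
        }
    }

    private final Map<Long, Entry> cache = new HashMap<>();
    private final ReferenceQueue<char[]> queue = new ReferenceQueue<>();

    char[] getString(long pointer) {
        expungeClearedEntries();
        Entry entry = cache.get(pointer);
        char[] chars = entry == null ? null : entry.get();
        if (chars == null) {
            chars = loadFromChunks(pointer);                     // the expensive re-construction
            cache.put(pointer, new Entry(pointer, chars, queue));
        }
        return chars;
    }

    // drop map entries whose values have already been garbage collected
    private void expungeClearedEntries() {
        Reference<? extends char[]> cleared;
        while ((cleared = queue.poll()) != null) {
            Entry entry = (Entry) cleared;
            cache.remove(entry.pointer, entry);                  // only if it is still the mapped entry
        }
    }

    // placeholder: in the real code this would read the chars back from the chunks
    private char[] loadFromChunks(long pointer) {
        return new char[0];
    }
}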

The resulting indexer performance in our AUTOSAR project looks promising: instead of 680 seconds, the indexer finished in 430 seconds. This is an improvement of roughly 37%!

Improved performance is typically a tradeoff between runtime and memory consumption. The above-mentioned patch uses weak references which are garbage collected as soon as heap size becomes sparse. Consequently, the newly introduced cache takes additional memory if available and the garbage collector takes care of cleaning it up when needed.

Benchmarking

To quantify the performance improvements, we ran some benchmark tests for the latter two attempts. We tried several variants on the string cache to reduce memory consumption with the cost of slightly more code complexity. As test projects, we used the already mentioned AUTOSAR project as well as the php open source project, which has a totally different source code structure. Each project was indexed between 11 and 20 times with an indexer cache between 32 and 1024 MB (which did not noticeably affect performance). The results are shown in the table below. We also tried several other open source projects but failed to produce reasonable benchmark results: either the indexing time differed by just 1-2 seconds for a total indexing time between 15 and 45 seconds (samba, vlc, ffmpeg), or the indexer aborted because of too many errors (linux kernel, gcc). All results are listed here.

 

Each cell shows the average / min / max runtime in seconds, with the speedup relative to the original runtime in percent. Notes (1), (2) and (3) are explained below the table.

Variant                                              | AUTOSAR [20 runs]: avg / min / max       | PHP [11 runs]: avg / min / max
-----------------------------------------------------|------------------------------------------|--------------------------------------
Original runtime                                     | 576 (0%) / 545 (0%) / 667 (0%)           | 54 (0%) / 51 (0%) / 59 (0%)
LongString.compare (Second Attempt)                  | 509 (11.6%) / 485 (18.1%) / 571 (6.2%)   | 55 (-1.2%) / 50 (-11.1%) / 60 (5.7%)
Weak (1) String Cache (Third Attempt)                | 370 (35.8%) / 342 (38%) / 419 (31.6%)    | 56 (3%) / 51 (-20.8%) / 64 (11.9%)
Soft (1) String Cache (Third Attempt)                | 366 (36.5%) / 337 (38.8%) / 410 (31.1%)  | 54 (0.9%) / 49 (-9.8%) / 61 (10.2%)
LongString.compare & Weak String Cache               | 441 (23.5%) / 422 (29.1%) / 503 (19.6%)  | 56 (-3.7%) / 53 (-11.1%) / 60 (5.1%)
LongString.compare & Soft String Cache               | 389 (32.4%) / 347 (37.2%) / 432 (29.2%)  | 54 (0.3%) / 49 (-9.4%) / 59 (16.9%)
Weak String Cache with disposing keys (2)            | 371 (35.6%) / 342 (37.6%) / 421 (31.9%)  | 55 (-2.1%) / 51 (-11.8%) / 59 (11.9%)
Soft String Cache with disposing keys                | 365 (36.7%) / 336 (39.3%) / 406 (34.7%)  | 56 (-3.5%) / 51 (-9.8%) / 61 (13.6%)
Weak String Cache with hard cache (3) (100 strings)  | 380 (34.1%) / 338 (38%) / 449 (26%)      | 56 (-2.5%) / 52 (-11.3%) / 59 (10.2%)
Soft String Cache with hard cache (100 strings)      | 374 (35.1%) / 337 (38.7%) / 442 (28.5%)  | 56 (-3.8%) / 51 (-13.2%) / 60 (13.6%)
Weak String Cache with hard cache (1000 strings)     | 370 (35.9%) / 340 (38.3%) / 420 (33.7%)  | 55 (-0.9%) / 49 (-13.2%) / 63 (11.9%)
Soft String Cache with hard cache (1000 strings)     | 369 (36%) / 335 (39.1%) / 413 (32.2%)    | 55 (-1.8%) / 51 (-9.4%) / 61 (10.2%)
 

The first row is the original indexer run time. The subsequent rows show the result of different adjustments of the second and third attempt with several combinations and adjustments of the string cache. Unfortunately, the benchmark results of the php project vary a lot so that there is no clear improvement for any of the listed attempts. There seems to be some other factor that makes the indexer runtime vary. At least the different attempts do not observably slow down the indexer runtime.

As you can see in the AUTOSAR column, the string cache may speed up indexing time by up to 39%! The LongString-specific compare implementation, on the other hand, only results in a speedup of 6-18%. Quite interesting is that a combination of the second and third attempt performs worse than the third attempt alone, but I don’t have any reasonable explanation for this.

Some further remarks to the variants we used for benchmarking:

(1) Ethan Nicholas says that “softly reachable objects are generally retained as long as memory is in plentiful supply. This makes them an excellent foundation for a cache”. The results confirm this citation.

(2) With a ReferenceQueue, the disposal of strings also disposes their keys in the map, so the garbage collector is capable of entirely wiping the string cache if necessary. Fortunately, this overhead does not have significant impact on performance – it even performs better than without key disposal.

(3) If memory gets sparse, weak and soft references are disposed by the garbage collector. But what if the most recently used strings would remain in memory anyhow? We used a hard cache to test this setting. Unfortunately, this does not yield any better performance. So we can omit this overhead.

Some words about additional memory consumption by the cache keys:

The life cycle of the string cache is bound to the life cycle of the indexer for a particular project. So the entire cache is discarded as soon as the indexer is discarded for a project. During the indexing task, however, the cache will be filled with keys (long pointers) to the cached strings which are wrapped in weak or soft references. So whenever the heap size becomes sparse, the garbage collector may dispose the cached strings (which happened between 2 and 90 times in the benchmarks above). When the indexer was done with the AUTOSAR project, the map had around 60.000 keys and only 2.000 cached strings, the rest was garbage collected. Without disposing the keys, memory consumption for these keys is 60.000 * 8 Byte (long value) = 480 KB, which is less than 1/500 of the default indexer cache size. IMHO, this temporary extra memory is more than tolerable for the achieved performance boost. Fortunately, the extra logic for disposing keys via a ReferenceQueue does not affect the performance at all, so it does not hurt to clean that up, too :-)

By the way, the modifications with the best results are provided in bug 514708.

Join the Discussion!

For code with fewer includes and fewer macro definitions, the exact same problems may not arise, but maybe the string lookup inside the binary tree is still the bottleneck. Please leave a comment if you have similar or perhaps other observations. Do you have further ideas or experience on how to improve performance?

 


by Patrick Könemann (patrick.koenemann@itemis.com) at May 31, 2017 01:20 PM

Red Hat to Acquire Codenvy to Extend DevOps Tools Capability

by Helen Beal at May 31, 2017 10:00 AM

Red Hat has announced the acquisition of Codenvy, an Agile and cloud-native tools provider. Financial terms of the deal are not being publicly disclosed.

By Helen Beal

by Helen Beal at May 31, 2017 10:00 AM

New Eclipse IoT Charter and Steering Committee

by Ian Skerrett at May 25, 2017 05:48 PM

It is hard to believe the Eclipse IoT Working Group was launched over 5 years ago, on November 1, 2011; at the time we called it Eclipse M2M. A lot has changed over these 5 years, including the name, and IoT has matured to be one of the dominant trends in the technology industry. The good news is the Eclipse IoT Working Group has been a huge success. We have a thriving open source community that includes 30 different projects, more than 200 developers and 30 member companies. Eclipse IoT is well known and positioned in the industry and continues to see momentum and growth.

Given this community growth, we felt it was time to take a fresh look at the Eclipse IoT Working Group charter and the Steering Committee. After a number of drafts and revisions, we have updated and published the new working group charter.  Most of the changes were done to reflect the current focus on IoT runtimes and frameworks and adding more clarity to the roles and responsibilities of the Steering Committee.

Now that the new charter has been approved, I am thrilled to have Red Hat, Bosch and Eurotech volunteer to participate in the Eclipse IoT Steering Committee. All three companies are active leaders in the Eclipse IoT community and the general IoT industry. They each bring a unique perspective on IoT and open source to our community:

  • Bosch is a world leading industrial company that is considered a leader in providing industrial IoT solutions. Their commitment and involvement in the Eclipse IoT community is evident by their involvement in projects like Eclipse Leshan, Eclipse Hono, Eclipse Vorto, Eclipse Hawkbit, and Eclipse Ditto.
  • Eurotech is a well-known industrial gateway vendor that was one of the founding members of Eclipse M2M. They have experienced incredible success with Eclipse Kura and are on the path to success with Eclipse Kapua.
  • Red Hat has deep roots in open source and enterprise IT. In the last 2 years they have become deeply involved in projects like Kapua, Hono and others. They have also been instrumental in helping launch our Eclipse IoT Open Testbeds.  Red Hat understands that for IoT to be successful it needs to integrate OT and IT. They are on the path to being a leader in this space.

The next 2-3 years are going to be very exciting for the IoT industry and in particular the Eclipse IoT community. We have the technology and individuals that are making a difference and delivering real and valuable technology for IoT solution developers. It is very exciting to have these 3 companies help lead the way to our continued success and momentum.

 



by Ian Skerrett at May 25, 2017 05:48 PM

The State of Android and Eclipse

by kingargyle at May 25, 2017 12:55 PM

So it has been a while since I posted an update on where things stand. Honestly, not a lot has changed on the Eclipse front. We still don’t have built-in AAR support, there is no integrated Gradle support between Buildship and Andmore, we are behind on Nougat support, and O is fast approaching release status. With that said, there have been a few bug fixes contributed by the community, which were released as a maintenance release, but larger corporate adoption… is pretty much non-existent from a contribution standpoint.

The latter I’m not sure how to fix, as I wholeheartedly believe that Andmore needs a corporate backer and sponsor to fund things. This could probably be avoided if several people who really have an itch to fix the base wanted to scratch it. There is so much to work on that it can be a bit overwhelming to figure out where to start.

When Google announced a couple of years ago that Android Development Tools would no longer be maintained, I had that itch to see if I could at least get it and the Moto Dev Studio tools to a place where they would have a chance to survive. I managed to scratch that itch and, over the next year, with the help of several other committers, bring you Andmore 0.5.0. Believe me, the most difficult work has been done, and that is getting everything through the IP process; what is there is clean from that standpoint.

So the general question, now that Andmore is at the Eclipse Foundation, is: is there a dire need from the community to have Android tooling? There was much outrage and yelling when Google made the decision to move to IntelliJ, some of it rightfully deserved, some of it just background noise. Regardless of the reasons, Android Tooling was always a Google-controlled and -sponsored project; it was open source in name only, really (it took nearly 2 years for Doug’s CDT integration to be integrated into the core). The same is happening with Android Studio: it is controlled and dominated by Google and is really open source in name only. If Google decides for whatever reason that they want to move all development to the cloud and abandon Android Studio and IntelliJ, there is nothing that anyone will be able to do to prevent it. The difference, though, is that JetBrains will pick up and continue developing Android support themselves. Why? Because they have a financial interest in making sure their IDE supports it for their corporate customers.

This brings us back to the question… Does the community really want Android tooling built on the Eclipse platform? If so, how do we improve it, given that right now this is largely a volunteer effort?

 



by kingargyle at May 25, 2017 12:55 PM

The Containerization of Dev Environments

by Doug Schaefer at May 24, 2017 03:38 PM

As a veteran tools architect working for a veteran embedded systems platform vendor, we’re getting pretty good at building cross development environments. You get all the speed and integration with other native tools that today’s rich host platforms can provide. Combine that with a good installer and software maintenance tool, and it’s super simple for users to get set up and keep their installs updated with the latest fixes and patches. It’s worked for many years.

So of course, I was a bit taken aback by recent talk about delivering development environments in containers and distributing them to users for use with cloud IDEs. The claim is that the installation process is simpler. But I have to ask: while yes, it is simpler for the provider, is it also simpler for the user?

I work with embedded software engineers. Their systems are complex and the last thing they want to do is fight with their tools. That doesn’t pay the bills. And that’s why we work so hard to make that management simpler. And if you don’t have the experience creating cross development environments, it is certainly appealing to only have to worry about one host platform, 64-bit Linux, as you do with Docker, which, BTW, just happens to be the easiest to support, especially relative to Windows.

But do I really have to teach my embedded developer customers about Docker? How to clean up images as updates are delivered? How to start and stop containers, and in the case of Windows and Mac, the VMs that run those containers? And that’s not to mention cloud environments which are a whole new level requiring server management, especially as the developer community scales. Embedded development tools require a lot of horsepower. How many users can a server actually support and how do customers support the burstiness of demand?

So, while I get it, and as vendors take this path and as users do get used to it, I do need to be prepared to support such environments. I’ll just feel a bit sad that we are giving up on providing our users the great experiences that native cross development tools provide.


by Doug Schaefer at May 24, 2017 03:38 PM

Fuse and BRMS Tooling Maintenance Release for Neon.3

by pleacu at May 24, 2017 02:03 PM

Try our complete Eclipse-Neon capable, Devstudio 10.4.0 compatible integration tooling.


JBoss Tools Integration Stack 4.4.3.Final / JBoss Developer Studio Integration Stack 10.3.0.GA

All of the Integration Stack components have been verified to work with the same dependencies as JBoss Tools 4.4 and Developer Studio 10.

What’s new for this release?

This release syncs up with Devstudio 10.4.0, JBoss Tools 4.4.4 and Eclipse Neon.3. It is also a maintenance release for Fuse Tooling, SwitchYard and the BRMS tooling.

Released Tooling Highlights

JBoss Fuse Development Highlights

Fuse Tooling Highlights

See the Fuse Tooling 9.2.0.Final Resolved Issues Section of the Integration Stack 10.3.0.GA release notes.

SwitchYard Highlights

See the SwitchYard 2.3.1.Final Resolved Issues Section of the Integration Stack 10.3.0.GA release notes.

JBoss Business Process and Rules Development

BPMN2 Modeler Known Issues

See the BPMN2 1.3.3.Final Known Issues Section of the Integration Stack 10.3.0.GA release notes.

Drools/jBPM6 Known Issues

Data Virtualization Highlights

Teiid Designer Known Issues

See the Teiid Designer 11.0.1.Final Resolved Issues Section of the Integration Stack 10.1.0.GA release notes.

What’s an Integration Stack?

Red Hat JBoss Developer Studio Integration Stack is a set of Eclipse-based development tools. It further enhances the IDE functionality provided by JBoss Developer Studio, with plug-ins specifically for use when developing for other Red Hat JBoss products. It’s where the Fuse Tooling, DataVirt Tooling and BRMS tooling are aggregated. The following frameworks are supported:

JBoss Fuse Development

  • Fuse Tooling - JBoss Fuse Development provides tooling for Red Hat JBoss Fuse. It features the latest versions of the Fuse Data Transformation tooling, Fuse Integration Services support, SwitchYard and access to the Fuse SAP Tool Suite.

  • SwitchYard - A lightweight service delivery framework providing full lifecycle support for developing, deploying, and managing service-oriented applications.

JBoss Business Process and Rules Development

JBoss Business Process and Rules Development plug-ins provide design, debug and testing tooling for developing business processes for Red Hat JBoss BRMS and Red Hat JBoss BPM Suite.

  • BPEL Designer - Orchestrating your business processes.

  • BPMN2 Modeler - A graphical modeling tool which allows creation and editing of Business Process Modeling Notation diagrams using Graphiti.

  • Drools - A Business Logic Integration Platform which provides a unified and integrated platform for Rules, Workflow and Event Processing, including KIE.

  • jBPM6 - A flexible Business Process Management (BPM) suite.

JBoss Data Virtualization Development

JBoss Data Virtualization Development plug-ins provide a graphical interface to manage various aspects of Red Hat JBoss Data Virtualization instances, including the ability to design virtual databases and interact with associated governance repositories.

  • Teiid Designer - A visual tool that enables rapid, model-driven definition, integration, management and testing of data services without programming using the Teiid runtime framework.

JBoss Integration and SOA Development

JBoss Integration and SOA Development plug-ins provide tooling for developing, configuring and deploying BRMS, SwitchYard and Fuse applications to Red Hat JBoss Fuse and Fuse Fabric containers, Apache ServiceMix, and Apache Karaf instances.

  • All of the Business Process and Rules Development plugins, plus…​

  • Fuse Apache Camel Tooling - A graphical tool for integrating software components that works with Apache ServiceMix, Apache ActiveMQ, Apache Camel and the FuseSource distributions.

  • SwitchYard - A lightweight service delivery framework providing full lifecycle support for developing, deploying, and managing service-oriented applications.

The JBoss Tools website features tab

Don’t miss the Features tab for up to date information on your favorite Integration Stack components.

Installation

The easiest way to install the Integration Stack components is through the stand-alone installer. If you’re interested specifically in Fuse, we have the all-in-one installer: JBoss Fuse Tooling + JBoss Fuse/Karaf runtime.

For a complete set of Integration Stack installation instructions, see the Integration Stack Installation Instructions.

Give it a try!

Paul Leacu.


by pleacu at May 24, 2017 02:03 PM

JBoss Tools and Red Hat Developer Studio Maintenance Release for Eclipse Neon.3

by jeffmaury at May 24, 2017 12:22 PM

JBoss Tools 4.4.4 and Red Hat JBoss Developer Studio 10.4 for Eclipse Neon.3 are here waiting for you. Check it out!


Installation

JBoss Developer Studio comes with everything pre-bundled in its installer. Simply download it from the Red Hat developers site and run it like this:

java -jar devstudio-<installername>.jar

JBoss Tools or Bring-Your-Own-Eclipse (BYOE) JBoss Developer Studio require a bit more:

This release requires at least Eclipse 4.6.3 (Neon.3), but we recommend using the latest Eclipse 4.6.3 Neon JEE Bundle, since then you get most of the dependencies preinstalled.

Once you have installed Eclipse, you can either find us on the Eclipse Marketplace under "JBoss Tools" or "Red Hat JBoss Developer Studio".

For JBoss Tools, you can also use our update site directly.

http://download.jboss.org/jbosstools/neon/stable/updates/

What is new?

Our main focus for this release was improvements for container based development and bug fixing.

Improved OpenShift 3 and Docker Tools

We continue to work on providing better experience for container based development in JBoss Tools and Developer Studio. Let’s go through a few interesting updates here.

OpenShift Server Adapter enhanced flexibility

The OpenShift server adapter is a great tool that allows developers to synchronize local changes in the Eclipse workspace with running pods in the OpenShift cluster. It also allows you to remote debug those pods when the server adapter is launched in Debug mode. The supported stacks are Java and NodeJS.

As pods are ephemeral OpenShift resources, the server adapter definition was based on an OpenShift service resource and the pods are then dynamically computed from the service selector.

This had a major drawback: the feature could be used only for pods that are part of a service, which may be logical for Web-based applications, as a route (and thus a service) is required in order to access the application.

So, it is now possible to create a server adapter from the following OpenShift resources:

  • service (as before)

  • deployment config

  • replication controller

  • pod

If a server adapter is created from a pod, it will be created from the associated OpenShift resource, in the preferred order:

  • service

  • deployment config

  • replication controller

The OpenShift explorer, which used to display only OpenShift resources linked to a service, has been enhanced as well: it now also displays resources linked to a deployment config or replication controller. Here is an example of a deployment with no service, i.e. a deployment config:

[Screenshot: server adapter enhanced]

So, as an OpenShift server adapter can be created from different kinds of resources, the kind of associated resource is displayed when creating the OpenShift server adapter:

[Screenshot: server adapter enhanced1]

Once created, the kind of OpenShift resource backing the server adapter is also displayed in the Servers view:

[Screenshot: server adapter enhanced2]

This information is also available from the server editor:

[Screenshot: server adapter enhanced3]

Security vulnerability fixed in certificate validation database

When you use the OpenShift tooling to connect to an OpenShift API server, the certificate of the OpenShift API server is first validated. If the issuer authority is a known one, the connection is established. If the issuer is unknown, a validation dialog is first shown to the user with the details of the OpenShift API server certificate as well as the details of the issuer authority. If the user accepts it, the connection is established. There is also an option to store the certificate in a database so that the next time a connection is attempted to the same OpenShift API server, the certificate will be considered valid and no validation dialog will be shown again.

[Screenshot: certificate validation dialog]

We found a security vulnerability in how the certificate was stored: it was only partially stored (not all attributes were saved), so a different certificate could be interpreted as already validated when it should not be.

We had to change the format of the certificate database. As the certificates stored in the previous database were not stored in their entirety, there was no way to provide a migration path. As a result, after the upgrade, the certificate database will be empty. So if you had previously accepted some certificates, you will need to accept them again and fill the certificate database again.
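
As an illustration of why storing the complete certificate matters, here is a minimal sketch using only plain JDK APIs (not the actual JBoss Tools code): it persists the full DER encoding of an accepted certificate and compares the whole certificate on lookup, so a partial match can never validate the wrong certificate.

import java.io.ByteArrayInputStream;
import java.security.MessageDigest;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

public class AcceptedCertificateStore {

    // Keyed by a digest of the full DER encoding; value is the full encoding.
    private final Map<String, String> acceptedByDigest = new HashMap<>();

    public void accept(X509Certificate cert) throws Exception {
        byte[] der = cert.getEncoded(); // complete certificate, not selected attributes
        acceptedByDigest.put(digest(der), Base64.getEncoder().encodeToString(der));
    }

    public boolean isAccepted(X509Certificate cert) throws Exception {
        String stored = acceptedByDigest.get(digest(cert.getEncoded()));
        if (stored == null) {
            return false;
        }
        // Re-parse and compare the full certificate, not just a fingerprint.
        CertificateFactory cf = CertificateFactory.getInstance("X.509");
        X509Certificate restored = (X509Certificate) cf.generateCertificate(
                new ByteArrayInputStream(Base64.getDecoder().decode(stored)));
        return restored.equals(cert);
    }

    private static String digest(byte[] der) throws Exception {
        return Base64.getEncoder().encodeToString(
                MessageDigest.getInstance("SHA-256").digest(der));
    }
}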

CDK 3 Server Adapter

The CDK 3 server adapter has been here for quite a long time. It used to be Tech Preview as CDK 3 was not officially released. It is now officially available. While the server adapter itself has limited functionality, it is able to start and stop the CDK virtual machine via its minishift binary. Simply hit Ctrl+3 (Cmd+3 on OSX) and type CDK; that will bring up a command to set up and/or launch the CDK server adapter. You should see the old CDK 2 server adapter along with the new CDK 3 one (labeled Red Hat Container Development Kit 3).

[Screenshot: cdk3 server adapter5]

All you have to do is set the credentials for your Red Hat account, the location of the CDK’s minishift binary file, and the type of virtualization hypervisor.

[Screenshot: cdk3 server adapter1]

Once you’re finished, a new CDK Server adapter will then be created and visible in the Servers view.

[Screenshot: cdk3 server adapter2]

Once the server is started, Docker and OpenShift connections should appear in their respective views, allowing the user to quickly create a new OpenShift application and begin developing their AwesomeApp in a highly replicable environment.

[Screenshots: cdk3 server adapter3, cdk3 server adapter4]

OpenShift Container Platform 3.5 support

OpenShift Container Platform (OCP) 3.5 has been announced by Red Hat. JBossTools 4.4.4.Final has been validated against OCP 3.5.

OpenShift server adapter extensibility

The OpenShift server adapter has long had support for EAP/WildFly and NodeJS based deployments. It turns out that most of its work is synchronizing local workspace changes to remote deployments on OpenShift, which have been standardized through image metadata (labels). But each runtime has its own specifics. As an example, WildFly/EAP deployments require that a re-deploy trigger be sent after the files have been synchronized.

In order to reduce the technical debt and allow support for other runtimes (there are lots of them in the microservice world), we have refactored the OpenShift server adapter so that the specifics of each runtime are now isolated and it will be easy and safe to add support for new runtimes.
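
Conceptually, the refactoring boils down to putting everything runtime-specific behind a small handler interface. The sketch below only illustrates that idea; the interface, class and label names are hypothetical and are not the actual JBoss Tools API.

import java.io.IOException;
import java.util.Map;

// Hypothetical: everything a runtime customizes about the publish cycle in one place.
public interface RuntimeSyncHandler {

    // Does this handler apply, based on the pod's image metadata (labels)?
    boolean supports(Map<String, String> imageLabels);

    // Hook invoked after local files have been synchronized to the pod.
    void afterFilesSynchronized(String podName) throws IOException;
}

// Example: EAP/WildFly needs an explicit re-deploy trigger after the sync.
class WildFlySyncHandler implements RuntimeSyncHandler {

    @Override
    public boolean supports(Map<String, String> imageLabels) {
        // The label key below is made up for illustration.
        return "wildfly".equals(imageLabels.get("example.io/runtime"));
    }

    @Override
    public void afterFilesSynchronized(String podName) throws IOException {
        // e.g. touch the deployment marker so the server picks up the changes
    }
}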

For a full in-depth description, see the following wiki page.

Pipeline builds support

Pipeline-based builds are now supported by the OpenShift tooling. When creating an application from a template, if one of the builds is pipeline based, you can view the details of the pipeline:

[Screenshot: pipeline wizard]

When your application is deployed, you can see the details of the build configuration for the pipeline-based builds:

[Screenshot: pipeline details]

More to come as we are improving the pipeline support in the OpenShift tooling.

Update of Docker Client

The version of the underlying com.spotify.docker.client plug-in used to access the Docker daemon has been upgraded to 3.6.8.

Run Image Network Support

A new page has been added to the Docker Run Image Wizard and Docker Run Image Launch configuration that allows the end user to specify the network mode to use. A user can choose from Default, Bridge, Host, None, Container, or Other. If Container is selected, the user must choose an active Container whose network will be shared. If Other is selected, a named network can be specified.
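
For reference, this is roughly what selecting a network mode boils down to when talking to the Docker daemon through the com.spotify.docker.client library. This is only a sketch, not the tooling's actual code; the image, command and chosen mode are placeholders.

import com.spotify.docker.client.DefaultDockerClient;
import com.spotify.docker.client.messages.ContainerConfig;
import com.spotify.docker.client.messages.ContainerCreation;
import com.spotify.docker.client.messages.HostConfig;

public class RunWithNetworkMode {

    public static void main(String[] args) throws Exception {
        // Reads DOCKER_HOST etc. from the environment.
        DefaultDockerClient docker = DefaultDockerClient.fromEnv().build();
        try {
            // "bridge", "host", "none", "container:<id>" or a named network.
            HostConfig hostConfig = HostConfig.builder()
                    .networkMode("host")
                    .build();

            ContainerConfig config = ContainerConfig.builder()
                    .hostConfig(hostConfig)
                    .image("alpine:3.5")
                    .cmd("sh", "-c", "sleep 30")
                    .build();

            ContainerCreation creation = docker.createContainer(config);
            docker.startContainer(creation.id());
        } finally {
            docker.close();
        }
    }
}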

[Screenshots: Network Mode selection, Network Mode configuration]

Refresh Connection

Users can now refresh the entire connection from the Docker Explorer View. Refresh can be performed in two ways:

  1. using the right-click context menu from the Connection

  2. using the Refresh menu button when the Connection is selected

[Screenshot: Refresh Connection]

Server Tools

API Change in JMX UI’s New Connection Wizard

While hardly something most users will care about, extenders may need to be aware that the API for adding connection types to the 'New JMX Connection' wizard in the 'JMX Navigator' has changed. Specifically, the 'org.jboss.tools.jmx.ui.providerUI' extension point has changed: where it previously had a child element called 'wizardPage', it now requires a 'wizardFragment'.

A 'wizardFragment' is part of the 'TaskWizard' framework first used in WTP’s ServerTools, which has been used throughout JBossTools for many years. This framework allows wizard workflows where the set of pages to be displayed can change based on what selections are made on previous pages.
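
For extenders, a minimal fragment looks roughly like the sketch below: a plain subclass of WTP's WizardFragment that the extension point's 'wizardFragment' element would point to. The class name and UI content here are made up for illustration.

import org.eclipse.swt.SWT;
import org.eclipse.swt.layout.GridLayout;
import org.eclipse.swt.widgets.Composite;
import org.eclipse.swt.widgets.Label;
import org.eclipse.wst.server.ui.wizard.IWizardHandle;
import org.eclipse.wst.server.ui.wizard.WizardFragment;

public class SampleConnectionWizardFragment extends WizardFragment {

    @Override
    public boolean hasComposite() {
        // This fragment contributes its own page content.
        return true;
    }

    @Override
    public Composite createComposite(Composite parent, IWizardHandle handle) {
        handle.setTitle("New JMX Connection");
        handle.setDescription("Enter the connection details.");

        Composite composite = new Composite(parent, SWT.NONE);
        composite.setLayout(new GridLayout(1, false));
        new Label(composite, SWT.NONE).setText("Connection settings go here");
        return composite;
    }
}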

This change was made as a direct result of a bug caused by the addition of the Jolokia connection type in which some standard workflows could no longer be completed.

This change only affects adopters and extenders, and should bring no noticeable change for the user, other than that the bug below has been fixed.

Hibernate Tools

Hibernate Runtime Provider Updates

A number of additions and updates have been performed on the available Hibernate runtime providers.

[Screenshot: Hibernate Runtime Provider Updates]

The Hibernate 5.0 runtime provider now incorporates Hibernate Core version 5.0.12.Final and Hibernate Tools version 5.0.5.Final.

The Hibernate 5.1 runtime provider now incorporates Hibernate Core version 5.1.4.Final and Hibernate Tools version 5.1.3.Final.

The Hibernate 5.2 runtime provider now incorporates Hibernate Core version 5.2.8.Final and Hibernate Tools version 5.2.2.Final.

Forge Tools

Forge Runtime updated to 3.6.1.Final

The included Forge runtime is now 3.6.1.Final. Read the official announcement here.


What is next?

With JBoss Tools 4.4.4 and Developer Studio 10.4 out, we are already working on the next release, for Eclipse Oxygen.

Enjoy!

Jeff Maury


by jeffmaury at May 24, 2017 12:22 PM