PHP in 2018

Festival de Software Libre

Puerto Vallarta

Oct. 28, 2017

http://talks.php.net/fsl17

Rasmus Lerdorf
@rasmus

1980s

1990s

1993

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>

#define ishex(x) (((x) >= '0' && (x) <= '9') || ((x) >= 'a' && \
                   (x) <= 'f') || ((x) >= 'A' && (x) <= 'F'))

int htoi(char *s) {
	int     value;
	char    c;

	c = s[0];
	if(isupper(c)) c = tolower(c);
	value=(c >= '0' && c <= '9' ? c - '0' : c - 'a' + 10) * 16;

	c = s[1];
	if(isupper(c)) c = tolower(c);
	value += c >= '0' && c <= '9' ? c - '0' : c - 'a' + 10;

	return(value);
}

int main(int argc, char *argv[]) {
	char *params, *data, *dest, *s, *tmp;
	char *name = "", *age = "";

	puts("Content-type: text/html\r\n");
	puts("<HTML><HEAD><TITLE>Form Example</TITLE></HEAD>");
	puts("<BODY><H1>My Example Form</H1>");
	puts("<FORM action=\"form.cgi\" method=\"GET\">");
	puts("Name: <INPUT type=\"text\" name=\"name\">");
	puts("Age: <INPUT type=\"text\" name=\"age\">");
	puts("<BR><INPUT type=\"submit\">");
	puts("</FORM>");

	data = getenv("QUERY_STRING");
	if(data && *data) {
		params = data; dest = data;
		while(*data) {
			if(*data=='+') *dest=' ';
			else if(*data == '%' && ishex(*(data+1))&&ishex(*(data+2))) {
				*dest = (char) htoi(data + 1);
				data+=2;
			} else *dest = *data;
			data++;
			dest++;
		}
		*dest = '\0';
		s = strtok(params,"&");
		do {
			tmp = strchr(s,'=');
			if(tmp) {
				*tmp = '\0';
				if(!strcmp(s,"name")) name = tmp+1;
				else if(!strcmp(s,"age")) age = tmp+1;
			}
		} while(s=strtok(NULL,"&"));

		printf("Hi %s, you are %s years old\n",name,age);
	}
	puts("</BODY></HTML>");
}

1993

use CGI qw(:standard);
print header;
print start_html('Form Example'),
    h1('My Example Form'),
    start_form,
    "Name: ", textfield('name'),
    p,
    "Age: ", textfield('age'),
    p,
    submit,
    end_form;
if(param()) {
    print "Hi ",em(param('name')),
        ", you are ",em(param('age')),
        " years old";
}
print end_html;

1994-1995

<html><head><title>Form Example</title></head>
<body><h1>My Example Form</h1>
<form action="form.phtml" method="POST">
Name: <input type="text" name="name">
Age: <input type="text" name="age">
<br><input type="submit">
</form>
<?if($name):?>
Hi <?echo $name?>, you are <?echo $age?> years old
<?endif?>
</body></html>

✔ PHP 7 engine improvements

  • 100%+ performance gain on most real-world applications
  • Lower memory usage, sometimes drastically lower

JIT?

Improve CPU cache usage

  • Step 1: Decrease overall data
  • Step 2: Better data locality and fewer indirections
  • Step 3: Save the world!
  • zval size reduced from 24 to 16 bytes
  • Hashtable size reduced from 72 to 56 bytes
  • Hashtable bucket size reduced from 72 to 32 bytes
  • Immutable array optimization
$a = [];
for($i=0; $i < 100000;$i++) {
    $a[] = ['abc','def','ghi','jkl','mno','pqr'];
}
echo memory_get_usage(true);

// PHP 5.x  109M
// PHP 7.0   42M no opcache
// PHP 7.0    6M with opcache enabled
  • New memory allocator similar to jemalloc
  • Faster hashtable iteration API
  • Array duplication optimization
  • PCRE JIT enabled by default
  • Precomputed string hashes
  • Fast ZPP (ZendParseParameters) implementation
  • Faster stack-allocated zvals (instead of heap)
  • Optimized VM calling
  • Global register variables with gcc 4.8+
  • plus hundreds of micro-optimizations

JIT?

GCC Feedback-Directed Optimization (FDO)

$ gcc --version
gcc (Debian 6.3.0-14) 6.3.0 20170415

$ make clean
$ make -j8 prof-gen
...
$ sapi/cgi/php-cgi -T 1000 /var/www/wordpress/index.php > /dev/null
$ make prof-clean
$ make -j8 prof-use

PHP 7 in production


Saving the Planet?

  • Around 2 billion sites on the web
  • Running on about 10 million physical machines
  • PHP drives at least 50% of them
  • Currently ~5% PHP 7 adoption
  • which is about 250k physical servers
  • 3000 kWh/year per server costs approx. US$400
  • Data center cooling doubles that
  • 0.5 kg CO2 per kWh

At 5% Adoption

  • US$200M savings
  • 750M kWh savings
  • 375M kg less CO2

At 100% Adoption

  • US$4B savings
  • 15B kWh savings
  • 7.5B kg less CO2
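
The savings figures above can be reproduced with a few lines of arithmetic. This is a sketch, not the speaker's own calculation: it assumes that PHP 7 doubles throughput, so each server running PHP 7 replaces one additional PHP 5 server, and that cooling is counted in the dollar figure but not in the kWh and CO2 figures (which is what makes the numbers line up). The constant and function names are made up for illustration.

```php
<?php
// Figures taken from the slides.
const MACHINES       = 10000000; // physical machines serving the web
const PHP_SHARE      = 0.5;      // fraction of them running PHP
const KWH_PER_SERVER = 3000;     // kWh per server per year
const USD_PER_SERVER = 400;      // US$ per server per year for that power
const COOLING_FACTOR = 2;        // cooling doubles the power bill
const KG_CO2_PER_KWH = 0.5;

// Assumption (not stated on the slides): every server running PHP 7
// stands in for one additional PHP 5 server that is no longer needed.
function php7_savings(float $adoption): array {
    $serversSaved = MACHINES * PHP_SHARE * $adoption;
    $kwh = $serversSaved * KWH_PER_SERVER;
    return [
        'servers' => $serversSaved,
        'usd'     => $serversSaved * USD_PER_SERVER * COOLING_FACTOR,
        'kwh'     => $kwh,                   // server power only
        'kg_co2'  => $kwh * KG_CO2_PER_KWH,
    ];
}

foreach ([0.05, 1.0] as $adoption) {
    $s = php7_savings($adoption);
    printf(
        '%d%% adoption: US$%.0fM, %.0fM kWh, %.0fM kg CO2' . PHP_EOL,
        round($adoption * 100),
        $s['usd'] / 1e6,
        $s['kwh'] / 1e6,
        $s['kg_co2'] / 1e6
    );
}
```

At 5% this gives 250k servers, US$200M, 750M kWh and 375M kg CO2; at 100% it scales by 20 to US$4B, 15B kWh and 7.5B kg.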

Do your part

Upgrade to PHP 7!

PHP 7.2

Initial DCE (dead code elimination) and SCCP (sparse conditional constant propagation) optimizations

Parameter Type Widening

class Orig {
  public function fn(array $arg) {  }
}
class Wider extends Orig {
  public function fn($arg) { }
}

In PHP 7.1 you would get:

Warning: Declaration of Wider::fn($arg) should be 
         compatible with Orig::fn(array $arg)
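
A small sketch of why widening is safe: every caller that passes an array to the parent still works against the child, and the child simply accepts more. The class names come from the slide; the method name accept() is a stand-in chosen for this example.

```php
<?php
class Orig {
    public function accept(array $arg) { return 'array'; }
}
class Wider extends Orig {
    // Dropping the type widens what the method accepts (no warning on PHP 7.2+).
    public function accept($arg) { return gettype($arg); }
}

$w = new Wider;
echo $w->accept([1, 2]), "\n";  // array
echo $w->accept('text'), "\n";  // string
```

Going the other way (adding a type in the child that the parent does not have) narrows the contract and is still rejected.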

Trailing commas in list syntax

Of the forms below, only grouped namespaces were accepted for PHP 7.2.
Trailing commas in function and method calls followed in PHP 7.3;
the class-level forms were not accepted.

// Arrays (already possible)
$array = [1, 2, 3,];

// Grouped namespaces (PHP 7.2)
use Foo\Bar\{ Foo, Bar, Baz, };
 
// Function/method arguments (call) (PHP 7.3)
fooCall($arg1, $arg2, $arg3,);

// Proposed but not accepted:
class Foo implements
    // Interface implementations on a class
    FooInterface,
    BarInterface,
    BazInterface,
{
    // Trait implementations on a class
    use
        FooTrait,
        BarTrait,
        BazTrait,
    ;
 
    // Class member lists
    const
        A = 1010,
        B = 1021,
        C = 1032,
        D = 1043,
    ;
}

Object typehint

function fn(object $obj): object {
	return json_decode('{}');
}
fn("not an object");
Warning: Uncaught TypeError: Argument 1 passed to fn() must be an object,
         string given, called in php shell code on line 1 and defined in ...

Deprecate unquoted strings

echo HELLO;
Warning: Use of undefined constant HELLO - assumed 'HELLO' (this will throw
         an Error in a future version of PHP) in php shell code on line 1

Extra headers array argument for mail() and mb_send_mail()

$to      = 'nobody@example.com';
$subject = 'the subject';
$message = 'hello';
$headers = ['From'     => 'webmaster@example.com',
            'Reply-To' => 'webmaster@example.com',
            'X-Mailer' => 'PHP/' . phpversion() ];

mail($to, $subject, $message, $headers);

Argon2i added to password_hash()

$options = [
    'memory_cost' => 2048,
    'time_cost' => 10,
    'threads' => 2
];
echo password_hash("rasmuslerdorf", PASSWORD_ARGON2I, $options);
$argon2i$v=19$m=2048,t=10,p=2$akUuWkoxRDlMSi5TVGhYLg$j0D1Cl4aR8UMHOGGx5JtZ1BmCApr8RmOJA9qFWm5mz8
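
Checking a password against such a hash goes through password_verify(), which reads the algorithm and cost parameters from the hash string itself. A minimal sketch, assuming default cost options; the fallback to PASSWORD_DEFAULT is only there so the example also runs on builds without Argon2.

```php
<?php
// Fall back to the default algorithm when PHP was built without Argon2.
$algo = defined('PASSWORD_ARGON2I') ? PASSWORD_ARGON2I : PASSWORD_DEFAULT;

$hash = password_hash('rasmuslerdorf', $algo);

// The hash carries its own algorithm, salt and cost parameters.
$ok  = password_verify('rasmuslerdorf', $hash);
$bad = password_verify('wrong password', $hash);
var_dump($ok, $bad);

// Becomes true once the stored hash no longer matches the algorithm or
// options you ask for, which is the cue to re-hash on the next login.
$stale = password_needs_rehash($hash, $algo);
var_dump($stale);
```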

Add Sodium Crypto Library

// On Alice's computer:
$msg = 'This comes from Alice.';
$signed_msg = sodium_crypto_sign($msg, $secret_sign_key);
// On Bob's computer:
$original_msg = sodium_crypto_sign_open($signed_msg, $alice_sign_publickey);
if ($original_msg === false) {
    throw new Exception("Invalid signature");
} else {
    echo $original_msg; // Displays "This comes from Alice."
}
see: https://paragonie.com/book/pecl-libsodium
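
The snippet above leaves out where the keys come from. A self-contained sketch, assuming the sodium extension is loaded: generate a signing keypair, sign, open, and confirm that a tampered message is rejected. The variable names are illustrative.

```php
<?php
// Requires the sodium extension (bundled since PHP 7.2).
$keypair   = sodium_crypto_sign_keypair();
$secretKey = sodium_crypto_sign_secretkey($keypair); // stays with Alice
$publicKey = sodium_crypto_sign_publickey($keypair); // handed to Bob

$msg    = 'This comes from Alice.';
$signed = sodium_crypto_sign($msg, $secretKey);      // signature + message
$opened = sodium_crypto_sign_open($signed, $publicKey);

// Flip one bit in the signature; verification should now fail.
$tampered    = $signed;
$tampered[0] = chr(ord($tampered[0]) ^ 1);
$rejected    = sodium_crypto_sign_open($tampered, $publicKey);

var_dump($opened === $msg, $rejected === false);
```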

Things that may break your code

  • mcrypt extension has been removed
  • __autoload() deprecated, use spl_autoload_register()
  • create_function() deprecated, use anonymous functions
  • each() deprecated, use foreach
  • read_exif_data() deprecated, use exif_read_data()
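
A rough sketch of what the replacements look like in practice. The ./src directory layout and the variable names are assumptions made up for this example.

```php
<?php
// __autoload($class) -> spl_autoload_register()
spl_autoload_register(function (string $class): void {
    // Hypothetical layout: one class per file under ./src
    $file = __DIR__ . '/src/' . str_replace('\\', '/', $class) . '.php';
    if (is_file($file)) {
        require $file;
    }
});

// create_function('$x', 'return $x * 2;') -> anonymous function
$double = function ($x) { return $x * 2; };

// while (list($key, $value) = each($fruit)) -> foreach
$fruit = ['a' => 'apple', 'b' => 'banana'];
$lines = [];
foreach ($fruit as $key => $value) {
    $lines[] = "$key=$value";
}

echo $double(21), "\n";           // 42
echo implode(',', $lines), "\n";  // a=apple,b=banana

// read_exif_data($file) -> exif_read_data($file), same arguments
```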

Full details are at:

https://github.com/php/php-src/blob/PHP-7.2/UPGRADING

And for extension authors:

https://github.com/php/php-src/blob/PHP-7.2/UPGRADING.INTERNALS

Dead Code Elimination (DCE)

Escape Analysis

Sparse Conditional Constant Propagation

php -d opcache.enable_cli=1 -d opcache.optimization_level=-1 -d opcache.opt_debug_level=0x20000 script
function fn() {
    $a = 1;
    return 0;
}

PHP 7.1

fn: (lines=2, args=0, vars=1, tmps=0)
L0:   ASSIGN CV0($a) int(1)
L1:   RETURN int(0)

PHP 7.2

fn: (lines=1, args=0, vars=0, tmps=0)
L0:   RETURN int(0)
function foo(string $s1, string $s2, string $s3, string $s4) {
    $x = ($s1 . $s2) . ($s3 . $s4);
    $x = 0;
    return $x;
}
PHP 7.1                                   PHP 7.2
foo: (lines=10, args=4, vars=5, tmps=3)   foo: (lines=5, args=4, vars=4, tmps=3)
L0:   CV0($s1) = RECV 1                   L0:   CV0($s1) = RECV 1
L1:   CV1($s2) = RECV 2                   L1:   CV1($s2) = RECV 2
L2:   CV2($s3) = RECV 3                   L2:   CV2($s3) = RECV 3
L3:   CV3($s4) = RECV 4                   L3:   CV3($s4) = RECV 4
L4:   T6 = CONCAT CV0($s1) CV1($s2)       L4:   RETURN int(0)
L5:   T7 = CONCAT CV2($s3) CV3($s4)
L6:   T5 = CONCAT T6 T7
L7:   ASSIGN CV4($x) T5
L8:   ASSIGN CV4($x) int(0)
L9:   RETURN CV4($x)

Try to trick it

function foo($a) {
    $b = $a += 3;
    return $a;
}

PHP 7.2

foo: (lines=3, args=1, vars=1, tmps=1)
L0:   CV0($a) = RECV 1
L1:   ASSIGN_ADD CV0($a) int(3)
L2:   RETURN CV0($a)

But...

function foo(int $x, int $y) {
    $a = [$x];
    $a[1] = $y;
    $a = $y;
    return $a;
}
PHP 7.2                                    PHP 7.3
foo: (lines=7, args=2, vars=3, tmps=1)     foo: (lines=4, args=2, vars=3, tmps=0)
L0:   CV0($x) = RECV 1                     L0:     CV0($x) = RECV 1
L1:   CV1($y) = RECV 2                     L1:     CV1($y) = RECV 2
L2:   CV2($a) = INIT_ARRAY 1 CV0($x) NEXT  L2:     CV2($a) = QM_ASSIGN CV1($y)
L3:   ASSIGN_DIM CV2($a) int(1)            L3:     RETURN CV2($a)
L4:   OP_DATA CV1($y)
L5:   ASSIGN CV2($a) CV1($y)
L6:   RETURN CV2($a)
class A { }
function foo(int $x) {
    $a = new A;
    $a->foo = $x;
    return $x;
}

PHP 7.3

foo: (lines=2, args=1, vars=1, tmps=0)
L0:   CV0($x) = RECV 1
L1:   RETURN CV0($x)
class A {
    function __destruct() {}
}
function foo(int $x) {
    $a = new A;
    $a->foo = $x;
    return $x;
}

PHP 7.3

foo: (lines=7, args=1, vars=2, tmps=1)
L0:   CV0($x) = RECV 1
L1:   V2 = NEW 0 string("A")
L2:   DO_FCALL
L3:   CV1($a) = QM_ASSIGN V2
L4:   ASSIGN_OBJ CV1($a) string("foo")
L5:   OP_DATA CV0($x)
L6:   RETURN CV0($x)
function foo() {
    $a = 1;
    $b = $a + 2;
    $a += $b;
    return $a;
}

PHP 7.2

foo: (lines=1, args=0, vars=0, tmps=1)
L0:   RETURN int(4)

PHP 7.3

foo: (lines=1, args=0, vars=0, tmps=0)
L0:   RETURN int(4)
function foo(int $x) {
    if ($x) {
        $a = [0,1];
    } else {
        $a = [0,2];
    }
    return $a[0];
}

PHP 7.3

foo: (lines=2, args=1, vars=1, tmps=0)
L0:   CV0($x) = RECV 1
L1:   RETURN int(0)
function foo() {
    $o = new stdClass();
    $o->foo = 0;
    $i = 1;
    $c = $i < 2;
    if ($c) {
        $k = 2 * $i;
        $o->foo = $i;
        echo $o->foo;
    }
    $o->foo += 2;
    $o->foo++;
    return $o->foo;
}

PHP 7.3

foo: (lines=2, args=0, vars=0, tmps=0)
L0:   ECHO int(1)
L1:   RETURN int(4)

Static Analysis



github.com/phan/phan

It can catch dumb mistakes

$a = [1,2,3];
if(count($a > 1)) {
    echo "Test";
}
% phan test.php
test.php:2 PhanTypeComparisonFromArray array to int comparison
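
Presumably the intended check was on the count, not on the array. A sketch of the fix, with a variable added only to make the result visible:

```php
<?php
$a = [1, 2, 3];

// Closing parenthesis moved: compare the count to 1, not the array to 1.
$hasSeveral = count($a) > 1;
if ($hasSeveral) {
    echo "Test";
}
```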

Check phpdoc comments

class C {
    /** @var int $prop */
    public $prop;

    /**
     * @param string $arg
     * @return int
     */
    function test(?string $arg) {
        $this->prop = $arg;
        return $arg;
    }
}
% phan test.php
test.php:9 PhanTypeMismatchDeclaredParamNullable Doc-block of $arg in test
           is phpdoc param type string which is not a permitted replacement
           of the nullable param type ?string declared in the signature 
           ('?T' should be documented as 'T|null' or '?T')
test.php:10 PhanTypeMismatchProperty Assigning string to property
            but \C::prop is int
test.php:11 PhanTypeMismatchReturn Returning type string 
            but test() is declared to return int

Help with refactoring

class C {
    /**
     * @deprecated
     */
    static function legacy_function() { }
}

C::legacy_function();
% phan test.php
test.php:8 PhanDeprecatedFunction Call to deprecated function 
           \C::legacy_function() defined at test.php:5

Install with composer

$ composer require --dev phan/phan

Create .phan/config.php

<?php
use \Phan\Issue;
return [
    'should_visit_all_nodes' => true,
    'minimum_severity' => Issue::SEVERITY_LOW,
    'directory_list' => [ 'src', 'vendor' ],
    'exclude_analysis_directory_list' => [ 'vendor' ]
];
$ ./vendor/bin/phan

Type Safety

Legacy code

class Data {
    function __construct($data) {
        $this->haystack = $data;
    }
    function find($needle) {
        return in_array($needle, $this->haystack, true);
    }
}
$storage = new Data(['apple','orange','banana']);

$fruit = false;
$storage->find($fruit);

Going straight to strict types risks runtime fatals

<?php declare(strict_types=1);
class Data {
    function __construct(array $data) {
        $this->haystack = $data;
    }
    function find(string $needle):bool {
        return in_array($needle, $this->haystack, true);
    }
}
$storage = new Data(['apple','orange','banana']);

$fruit = false;
$storage->find($fruit);
Fatal error: Uncaught TypeError: Argument 1 passed to Data::find() must be of the type string, boolean given,
                                 called in test.php on line 13 and defined in test.php:6
Stack trace:
#0 test.php(13): Data->find(false)
#1 {main}
thrown in test.php on line 6

Intermediate step

class Data {
    /** @var array $haystack */
    public $haystack;

    /**
     * @param array $data
     */
    function __construct($data) {
        $this->haystack = $data;
    }
    /**
     * @param string $needle
     * @return bool
     */
    function find($needle) {
        return in_array($needle, $this->haystack, true);
    }
}
$storage = new Data(['apple','orange','banana']);

$fruit = false;
$storage->find($fruit);
$ phan test.php
test.php:22 PhanTypeMismatchArgument Argument 1 (needle) is bool
            but \Data::find() takes string defined at test.php:15
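
Once Phan reports nothing for the annotated version, the docblock types can be promoted to real declarations with much less risk. A sketch of where this might end up, assuming the call site was fixed to pass a string:

```php
<?php declare(strict_types=1);

class Data {
    /** @var array */
    public $haystack;

    function __construct(array $data) {
        $this->haystack = $data;
    }
    function find(string $needle): bool {
        return in_array($needle, $this->haystack, true);
    }
}

$storage = new Data(['apple', 'orange', 'banana']);

$fruit = 'apple';                  // was false before Phan flagged it
var_dump($storage->find($fruit));  // bool(true)
```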

Thank You

http://talks.php.net/fsl17
https://github.com/phan/phan
https://paragonie.com/book/pecl-libsodium
https://github.com/php/php-src/blob/PHP-7.2/UPGRADING
https://bugs.php.net



Report Bugs

Useful bug reports, please!