Convert Anything to Tree Structures in PHP

6 minute read

I recently faced a programming challenge that almost broke my brain. I needed to create a function that could explode any single-dimensional array into a full blown tree structure, based on the delimiters found in it's keys. Tricky part was size of the tree could be infinite. I called the function: explodeTree. And maybe it's best to first look at an example.

The Directory Example

Here I will give an example what the explodeTree function could be used for. Let's say we need a recursive directory listing of /etc/php5, and for that we execute:

<?php
if(exec("find /etc/php5", $files)){
    // the $files array now holds the path as it's values,
    // but we also want the paths as keys:
    $key_files = array_combine(array_values($files), array_values($files));

    // show the array
    print_r($key_files);
}
?>

Which will return something like:

Array
(
    [/etc/php5] => /etc/php5
    [/etc/php5/cli] => /etc/php5/cli
    [/etc/php5/cli/conf.d] => /etc/php5/cli/conf.d
    [/etc/php5/cli/php.ini] => /etc/php5/cli/php.ini
    [/etc/php5/conf.d] => /etc/php5/conf.d
    [/etc/php5/conf.d/mysqli.ini] => /etc/php5/conf.d/mysqli.ini
    [/etc/php5/conf.d/curl.ini] => /etc/php5/conf.d/curl.ini
    [/etc/php5/conf.d/snmp.ini] => /etc/php5/conf.d/snmp.ini
    [/etc/php5/conf.d/gd.ini] => /etc/php5/conf.d/gd.ini
    [/etc/php5/apache2] => /etc/php5/apache2
    [/etc/php5/apache2/conf.d] => /etc/php5/apache2/conf.d
    [/etc/php5/apache2/php.ini] => /etc/php5/apache2/php.ini
)

Now if we want to transform this list into a tree structure with each directory as a nested node, a child of another directory, all we would have to do is run:

<?php
// let '/' be our delimiter
$tree = explodeTree($key_files, "/");
// show the array
print_r($tree);
?>

And that single command would give the totally awesome:

Array
(
    [etc] => Array
        (
            [php5] => Array
                (
                    [cli] => Array
                        (
                            [conf.d] => /etc/php5/cli/conf.d
                            [php.ini] => /etc/php5/cli/php.ini
                        )
                    [conf.d] => Array
                        (
                            [mysqli.ini] => /etc/php5/conf.d/mysqli.ini
                            [curl.ini] => /etc/php5/conf.d/curl.ini
                            [snmp.ini] => /etc/php5/conf.d/snmp.ini
                            [gd.ini] => /etc/php5/conf.d/gd.ini
                        )

                    [apache2] => Array
                        (
                            [conf.d] => /etc/php5/apache2/conf.d
                            [php.ini] => /etc/php5/apache2/php.ini
                        )
                )
        )
)

Wow! So this would make it very easy to visually layout a tree structure of the directory /etc/php5. But remember this is just an example. The function now explodes on the '/' character, but you can use any delimiter to explode a single-dimensional array into a Tree. So how does this explodeTree function work?

The Function: explodeTree()

Thanks to Lachlan Donald and Takkie, for contributing to this() function.

<?php
/**
 * Explode any single-dimensional array into a full blown tree structure,
 * based on the delimiters found in it's keys.
 *
 * The following code block can be utilized by PEAR's Testing_DocTest
 * <code>
 * // Input //
 * $key_files = array(
 *	 "/etc/php5" => "/etc/php5",
 *	 "/etc/php5/cli" => "/etc/php5/cli",
 *	 "/etc/php5/cli/conf.d" => "/etc/php5/cli/conf.d",
 *	 "/etc/php5/cli/php.ini" => "/etc/php5/cli/php.ini",
 *	 "/etc/php5/conf.d" => "/etc/php5/conf.d",
 *	 "/etc/php5/conf.d/mysqli.ini" => "/etc/php5/conf.d/mysqli.ini",
 *	 "/etc/php5/conf.d/curl.ini" => "/etc/php5/conf.d/curl.ini",
 *	 "/etc/php5/conf.d/snmp.ini" => "/etc/php5/conf.d/snmp.ini",
 *	 "/etc/php5/conf.d/gd.ini" => "/etc/php5/conf.d/gd.ini",
 *	 "/etc/php5/apache2" => "/etc/php5/apache2",
 *	 "/etc/php5/apache2/conf.d" => "/etc/php5/apache2/conf.d",
 *	 "/etc/php5/apache2/php.ini" => "/etc/php5/apache2/php.ini"
 * );
 *
 * // Execute //
 * $tree = explodeTree($key_files, "/", true);
 *
 * // Show //
 * print_r($tree);
 *
 * // expects:
 * // Array
 * // (
 * //	 [etc] => Array
 * //		 (
 * //			 [php5] => Array
 * //				 (
 * //					 [__base_val] => /etc/php5
 * //					 [cli] => Array
 * //						 (
 * //							 [__base_val] => /etc/php5/cli
 * //							 [conf.d] => /etc/php5/cli/conf.d
 * //							 [php.ini] => /etc/php5/cli/php.ini
 * //						 )
 * //
 * //					 [conf.d] => Array
 * //						 (
 * //							 [__base_val] => /etc/php5/conf.d
 * //							 [mysqli.ini] => /etc/php5/conf.d/mysqli.ini
 * //							 [curl.ini] => /etc/php5/conf.d/curl.ini
 * //							 [snmp.ini] => /etc/php5/conf.d/snmp.ini
 * //							 [gd.ini] => /etc/php5/conf.d/gd.ini
 * //						 )
 * //
 * //					 [apache2] => Array
 * //						 (
 * //							 [__base_val] => /etc/php5/apache2
 * //							 [conf.d] => /etc/php5/apache2/conf.d
 * //							 [php.ini] => /etc/php5/apache2/php.ini
 * //						 )
 * //
 * //				 )
 * //
 * //		 )
 * //
 * // )
 * </code>
 *
 * @author	Kevin van Zonneveld <kevin@vanzonneveld.net>
 * @author	Lachlan Donald
 * @author	Takkie
 * @copyright 2008 Kevin van Zonneveld (http://kevin.vanzonneveld.net)
 * @license   http://www.opensource.org/licenses/bsd-license.php New BSD Licence
 * @version   SVN: Release: $Id: explodeTree.inc.php 89 2008-09-05 20:52:48Z kevin $
 * @link	  http://kevin.vanzonneveld.net/
 *
 * @param array   $array
 * @param string  $delimiter
 * @param boolean $baseval
 *
 * @return array
 */
function explodeTree($array, $delimiter = '_', $baseval = false)
{
	if(!is_array($array)) return false;
	$splitRE   = '/' . preg_quote($delimiter, '/') . '/';
	$returnArr = array();
	foreach ($array as $key => $val) {
		// Get parent parts and the current leaf
		$parts	= preg_split($splitRE, $key, -1, PREG_SPLIT_NO_EMPTY);
		$leafPart = array_pop($parts);

		// Build parent structure
		// Might be slow for really deep and large structures
		$parentArr = &$returnArr;
		foreach ($parts as $part) {
			if (!isset($parentArr[$part])) {
				$parentArr[$part] = array();
			} elseif (!is_array($parentArr[$part])) {
				if ($baseval) {
					$parentArr[$part] = array('__base_val' => $parentArr[$part]);
				} else {
					$parentArr[$part] = array();
				}
			}
			$parentArr = &$parentArr[$part];
		}

		// Add the final part to the structure
		if (empty($parentArr[$leafPart])) {
			$parentArr[$leafPart] = $val;
		} elseif ($baseval && is_array($parentArr[$leafPart])) {
			$parentArr[$leafPart]['__base_val'] = $val;
		}
	}
	return $returnArr;
}
?>

The first to arguments of explodeTree() are clear I guess. But what about that 3rd parameter: $baseval?

The Baseval Argument

In the first example you see that only leafs (the bottom nodes that don't have any children) maintain their original values (the filepaths in this case). If you want higher nodes (parents) to also maintain their values, you'll have to tell explodeTree to do so like this:

<?php
// now the 3rd argument, the baseval, is true
$tree = explodeTree($key_files, "/", true);
?>

And then explodeTree will preserve the node's original value in the __base_val items. Like this:

Array
(
    [etc] => Array
        (
            [__base_val] =>
            [php5] => Array
                (
                    [__base_val] => /etc/php5
                    [cli] => Array
                        (
                            [__base_val] => /etc/php5/cli
                            [conf.d] => /etc/php5/cli/conf.d
                            [php.ini] => /etc/php5/cli/php.ini
                        )

                    [conf.d] => Array
                        (
                            [__base_val] => /etc/php5/conf.d
                            [mysqli.ini] => /etc/php5/conf.d/mysqli.ini
                            [curl.ini] => /etc/php5/conf.d/curl.ini
                            [snmp.ini] => /etc/php5/conf.d/snmp.ini
                            [gd.ini] => /etc/php5/conf.d/gd.ini
                        )
                    [apache2] => Array
                        (
                            [__base_val] => /etc/php5/apache2
                            [conf.d] => /etc/php5/apache2/conf.d
                            [php.ini] => /etc/php5/apache2/php.ini
                        )

                )

        )

)

See what happens? Baseval creates a placeholder. A semi-node for the original value of it's parent. The value: '/etc/php5' is now saved, without baseval this value would be lost because there was no place to store it in. That might come in handy!

So you've got a tree. Now what?

Trees with unlimited levels of nodes require recursive functions that can traverse the entire structure. Recursive functions are functions that call themselves every time they find more items to process. Here's one to layout the directories:

<?php
function plotTree($arr, $indent=0, $mother_run=true){
    if ($mother_run) {
        // the beginning of plotTree. We're at rootlevel
        echo "start\n";
    }

    foreach ($arr as $k=>$v){
        // skip the baseval thingy. Not a real node.
        if ($k == "__base_val") continue;
        // determine the real value of this node.
        $show_val = (is_array($v) ? $v["__base_val"] : $v);
        // show the indents
        echo str_repeat("  ", $indent);
        if ($indent == 0) {
            // this is a root node. no parents
            echo "O ";
        } elseif (is_array($v)){
            // this is a normal node. parents and children
            echo "+ ";
        } else {
            // this is a leaf node. no children
            echo "- ";
        }

        // show the actual node
        echo $k . " (" . $show_val. ")" . "\n";
        if (is_array($v)) {
            // this is what makes it recursive, rerun for childs
            plotTree($v, ($indent+1), false);
        }
    }

    if ($mother_run) {
        echo "end\n";
    }
}
?>

And this would output:

start
O etc ()
  + php5 (/etc/php5)
    + cli (/etc/php5/cli)
      - conf.d (/etc/php5/cli/conf.d)
      - php.ini (/etc/php5/cli/php.ini)
    + conf.d (/etc/php5/conf.d)
      - mysqli.ini (/etc/php5/conf.d/mysqli.ini)
      - curl.ini (/etc/php5/conf.d/curl.ini)
      - snmp.ini (/etc/php5/conf.d/snmp.ini)
      - gd.ini (/etc/php5/conf.d/gd.ini)
    + apache2 (/etc/php5/apache2)
      - conf.d (/etc/php5/apache2/conf.d)
      - php.ini (/etc/php5/apache2/php.ini)
end

If I overlooked a standard PHP function that can already do this, or you have other improvements/ideas leave a comment!

Thanks again: Lachlan Donald & Tokkie for insightful comments and great effort.

Leave a Comment Right Here