Visible to Intel only — GUID: GUID-FBAC10B3-2DC2-4D19-8F98-8DE4F8C82705
Visible to Intel only — GUID: GUID-FBAC10B3-2DC2-4D19-8F98-8DE4F8C82705
DBSCAN
Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed in [Ester96]. It is a density-based clustering non-parametric algorithm: given a set of observations in some space, it groups together observations that are closely packed together (observations with many nearby neighbors), marking as outliers observations that lie alone in low-density regions (whose nearest neighbors are too far away).
Operation |
Computational methods |
Programming Interface |
||
Default method |
Mathematical formulation
Refer to Developer Guide: DBSCAN.
Programming Interface
All types and functions in this section are declared in the oneapi::dal::dbscan namespace and are available via inclusion of the oneapi/dal/algo/dbscan.hpp header file.
Descriptor
template<typenameFloat=float,typenameMethod=method::by_default,typenameTask=task::by_default>classdescriptor
- Template Parameters
-
Float – The floating-point type that the algorithm uses for intermediate computations. Can be float or double.
Method – Tag-type that specifies an implementation of algorithm. Can be method::brute_force.
Task – Tag-type that specifies the type of the problem to solve. Can be task::clustering.
Constructors
descriptor(doubleepsilon, std::int64_tmin_observations)
Creates a new instance of the class with the given epsilon, min_observations.
Properties
doubleepsilon
The distance epsilon for neighbor search.
- Getter & Setter
-
double get_epsilon() const
auto & set_epsilon(double value)
- Invariants
-
epsilon >= 0.0
std::int64_tmin_observations
The number of neighbors.
- Getter & Setter
-
std::int64_t get_min_observations() const
auto & set_min_observations(std::int64_t value)
result_option_idresult_options
Choose which results should be computed and returned.
- Getter & Setter
-
result_option_id get_result_options() const
auto & set_result_options(const result_option_id &value)
boolmem_save_mode
The flag for memory saving mode.
- Getter & Setter
-
bool get_mem_save_mode() const
auto & set_mem_save_mode(bool value)
Method tags
structbrute_force
usingby_default=brute_force
Task tags
structclustering
Tag-type that parameterizes entities used for solving clustering problem.
usingby_default=clustering
Alias tag-type for the clustering task.
Computation compute(...)
Input
template<typenameTask=task::by_default>classcompute_input
- Template Parameters
-
Task – Tag-type that specifies type of the problem to solve. Can be task::clustering.
Constructors
compute_input(consttable&data={}, consttable&weights={})
Creates a new instance of the class with the given data and weights.
Properties
consttable&weights
A single column table with the weights, where each row stores one weight per observation.
- Getter & Setter
-
const table & get_weights() const
auto & set_weights(const table &weights)
consttable&data
An table with the data to be clustered, where each row stores one feature vector.
- Getter & Setter
-
const table & get_data() const
auto & set_data(const table &data)
Result
template<typenameTask=task::by_default>classcompute_result
- Template Parameters
-
Task – Tag-type that specifies type of the problem to solve. Can be task::clustering.
Constructors
compute_result()
Creates a new instance of the class with the default property values.
Properties
consttable&core_observations
An table with the core observations in the input data. is a number of core observations.
- Getter & Setter
-
const table & get_core_observations() const
auto & set_core_observations(const table &value)
constresult_option_id&result_options
Result options that indicates availability of the properties. Default value: default_result_options<Task>.
- Getter & Setter
-
const result_option_id & get_result_options() const
auto & set_result_options(const result_option_id &value)
consttable&responses
An table with the responses assigned to the samples in the input data. Default value: table{}.
- Getter & Setter
-
const table & get_responses() const
auto & set_responses(const table &value)
consttable&core_observation_indices
An table with the indices of core observations in the input data. is a number of core observations.
- Getter & Setter
-
const table & get_core_observation_indices() const
auto & set_core_observation_indices(const table &value)
std::int64_tcluster_count
The number of clusters found by the algorithm.
- Getter & Setter
-
std::int64_t get_cluster_count() const
auto & set_cluster_count(std::int64_t value)
- Invariants
-
cluster_count >= 0
consttable&core_flags
An table with the core flags assigned to the samples in the input data.
- Getter & Setter
-
const table & get_core_flags() const
auto & set_core_flags(const table &value)
Operation
template<typenameDescriptor>dbscan::compute_resultcompute(constDescriptor&desc, constdbscan::compute_input&input)
- Parameters
-
desc – DBSCAN algorithm descriptor dbscan::descriptor
input – Input data for the compute operation
Usage Example
Compute
void run_compute(const table& data,
const table& weights) {
double epsilon = 1.0;
std::int64_t max_observations = 5;
const auto dbscan_desc = kmeans::descriptor<float>{epsilon, max_observations}
.set_result_options(dal::dbscan::result_options::responses);
const auto result = compute(dbscan_desc, data, weights);
print_table("responses", result.get_responses());
}
Examples
oneAPI DPC++
Batch Processing:
oneAPI C++
Batch Processing: