`combinatoricsGeneral.Rd`

Generate combinations or permutations of a vector with or without constraints.

Produce results in parallel using the

`Parallel`

or`nThreads`

arguments. You can also apply each of the five compiled functions given by the argument`constraintFun`

in parallel.The arguments

`lower`

and`upper`

make it possible to generate combinations/permutations in chunks allowing for parallelization via the`parallel-package`

. This is convenient when you want to apply a custom function to the output in parallel.Attack integer partition and general subset sum problems.

GMP support allows for exploration of combinations/permutations of vectors with many elements.

The output is in lexicographical order.

```
comboGeneral(v, m = NULL, repetition = FALSE, freqs = NULL,
lower = NULL, upper = NULL, constraintFun = NULL,
comparisonFun = NULL, limitConstraints = NULL,
keepResults = NULL, FUN = NULL, Parallel = FALSE,
nThreads = NULL, tolerance = NULL, FUN.VALUE = NULL)
permuteGeneral(v, m = NULL, repetition = FALSE, freqs = NULL,
lower = NULL, upper = NULL, constraintFun = NULL,
comparisonFun = NULL, limitConstraints = NULL,
keepResults = NULL, FUN = NULL, Parallel = FALSE,
nThreads = NULL, tolerance = NULL, FUN.VALUE = NULL)
```

- v
Source vector. If

`v`

is a positive integer, it will be converted to the sequence`1:v`

. If`v`

is a negative integer, it will be converted to the sequence`v:-1`

. All atomic types are supported (See`is.atomic`

).- m
Number of elements to choose. If

`repetition = TRUE`

or`freqs`

is utilized,`m`

can exceed the length of`v`

. If`m = NULL`

, the length will default to`length(v)`

or`sum(freqs)`

.- repetition
Logical value indicating whether combinations/permutations should be with or without repetition. The default is

`FALSE`

.- freqs
A vector of frequencies used for producing all combinations/permutations of a multiset of

`v`

. Each element of`freqs`

represents how many times each element of the source vector,`v`

, is repeated. It is analogous to the`times`

argument in`rep`

. The default value is`NULL`

.- lower
The lower bound. Combinations/permutations are generated lexicographically, thus utilizing this argument will determine which specific combination/permutation to start generating from (

*e.g.*`comboGeneral(5, 3, lower = 6)`

is equivalent to`comboGeneral(5, 3)[6:choose(5, 3), ]`

). This argument along with`upper`

is very useful for generating combinations/permutations in chunks allowing for easy parallelization.- upper
The upper bound. Similar to

`lower`

, however this parameter allows the user to*stop*generation at a specific combination/permutation (*e.g.*`comboGeneral(5, 3, upper = 5)`

is equivalent to`comboGeneral(5, 3)[1:5, ]`

)If the output is constrained and

`lower`

isn't supplied,`upper`

serves as a cap for how many results will be returned that meet the criteria (*e.g.*setting`upper = 100`

alone will return the first 100 results that meet the criteria, while setting`lower = 1`

and`upper = 100`

will test the first 100 results against the criteria).In addition to the benefits listed for

`lower`

, this parameter is useful when the total number of combinations/permutations without constraint is large and you expect/need a small number of combinations/permutations that meet a certain criteria. Using`upper`

can improve run time if used judiciously as we call the member function reserve of std::vector. See examples below.- constraintFun
Function to be applied to the elements of

`v`

that should be passed as a string (*e.g.*`constraintFun = "sum"`

). The possible constraint functions are:`"sum"`

,`"prod"`

,`"mean"`

,`"max"`

, &`"min"`

. The default is`NULL`

, meaning no function is applied.- comparisonFun
Comparison operator that will be used to compare

`limitConstraints`

with the result of`constraintFun`

applied to`v`

. It should be passed as a string or a vector of two strings (*e.g.*`comparisonFun = "<="`

or`comparisonFun = c(">","<")`

). The possible comparison operators are:`"<"`

,`">"`

,`"<="`

,`">="`

,`"=="`

. The default is`NULL`

.When

`comparisonFun`

is a vector of two comparison strings,*e.g*`comparisonFun = c(comp1, comp2)`

, and`limitConstraints`

is a vector of two numerical values,*e.g*`limitConstraints = c(x1, x2)`

, the combinations/permutations will be filtered in one of the following two ways:When

`comp1`

is one of the 'greater-than' operators (*i.e.*">=" or ">"),`comp2`

is one of the 'less-than' operators (*i.e.*"<=" or "<"), and`x1 < x2`

, the combinations/permutations that are returned will have a value (after`constraintFun`

has been applied) between`x1`

and`x2`

.When

`comp1`

and`comp2`

are defined as in #1 and`x1 > x2`

, the combinations/permutations that are returned will have a value outside the range of`x1`

and`x2`

. See the examples below.

In other words, the first comparison operator is applied to the first limit and the second operator is applied to the second limit.

- limitConstraints
This is the value(s) that will be used for comparison. Can be passed as a single value or a vector of two numerical values. The default is

`NULL`

. See the definition of`comparisonFun`

as well as the examples below for more information.- keepResults
A logical flag indicating if the result of

`constraintFun`

applied to`v`

should be displayed; if`TRUE`

, an additional column of results will be added to the resulting matrix. The default is`FALSE`

. If user is only applying`constraintFun`

,`keepResults`

will default to`TRUE`

.*E.g*. The following are equivalent and will produce a \(4^{th}\) column of row sums:`comboGeneral(5, 3 constraintFun = "sum", keepResults = TRUE)`

`comboGeneral(5, 3 constraintFun = "sum")`

- FUN
Function to be applied to each combination/permutation. The default is

`NULL`

.- Parallel
Logical value indicating whether combinations/permutations should be generated in parallel using \(n - 1\) threads, where \(n\) is the maximum number of threads. The default is

`FALSE`

. If`nThreads`

is not`NULL`

, it will be given preference (*e.g.*if user has 8 threads with`Parallel = TRUE`

and`nThreads = 4`

, only 4 threads will be spawned). If your system is single-threaded, the arguments`Parallel`

and`nThreads`

are ignored.- nThreads
Specific number of threads to be used. The default is

`NULL`

. See`Parallel`

.- tolerance
A numeric value greater than or equal to zero. This parameter is utilized when a constraint is applied on a numeric vector. The default value is 0 when it can be determined that whole values are being utilized, otherwise it is

`sqrt(.Machine$double.eps)`

which is approximately \(1.5e-8\). N.B. If the input vector is of type integer, this parameter will be ignored and strict equality will be enforced.- FUN.VALUE
A template for the return value from

`FUN`

. See 'Details' of`vapply`

for more information.

In general, a matrix with \(m\) or \(m + 1\) columns, depending on the value of

`keepResults`

If

`FUN`

is utilized and`FUN.VALUE = NULL`

, a list is returnedWhen both

`FUN`

and`FUN.VALUE`

are not`NULL`

, the return is modeled after the return of`vapply`

. See the 'Value' section of`vapply`

.

Finding all combinations/permutations with constraints is optimized by organizing them in such a way that when `constraintFun`

is applied, a *partially* monotonic sequence is produced. Combinations/permutations are added successively, until a particular combination exceeds the given constraint value for a given constraint/comparison function combo. After this point, we can safely skip several combinations knowing that they will exceed the given constraint value.

When there are any negative values in `v`

and `constraintFun = "prod"`

, producing a monotonic set is non-trivial for the general case. As a result, performance will suffer as all combinations/permutations must be tested against the constraint criteria.

`Parallel`

and`nThreads`

will be ignored in the following cases:When the output is constrained (except for most partitions cases)

If the class of the vector passed is

`character`

,`raw`

, and`complex`

(N.B.`Rcpp::CharacterMatrix`

is not thread safe). Alternatively, you can generate an indexing matrix in parallel.If

`FUN`

is utilized.

If either

`constraintFun`

,`comparisonFun`

or`limitConstraints`

is`NULL`

--or-- if the class of the vector passed is`logical`

,`character`

,`raw`

,`factor`

, or`complex`

, the constraint check will not be carried out. This is equivalent to simply finding all combinations/permutations of \(v\) choose \(m\).The maximum number of combinations/permutations that can be generated at one time is \(2^{31} - 1\). Utilizing

`lower`

and`upper`

makes it possible to generate additional combinations/permutations.Factor vectors are accepted. Class and level attributes are preserved except when

`FUN`

is used.Lexicographical ordering isn't guaranteed for permutations if

`lower`

isn't supplied and the output is constrained.If

`lower`

is supplied and the output is constrained, the combinations/permutations that will be tested will be in the lexicographical range`lower`

to`upper`

or up to the total possible number of results if`upper`

is not given. See the second paragraph for the definition of`upper`

.`FUN`

will be ignored if the constraint check is satisfied.

```
system.time(comboGeneral(17, 8))
#> user system elapsed
#> 0.000 0.000 0.001
system.time(permuteGeneral(13, 6))
#> user system elapsed
#> 0.003 0.002 0.006
system.time(comboGeneral(13,10,repetition = TRUE))
#> user system elapsed
#> 0.004 0.000 0.004
system.time(permuteGeneral(factor(letters[1:9]),6,TRUE))
#> user system elapsed
#> 0.001 0.001 0.001
## permutations of the multiset (with or w/o setting m) :
## c(1,1,1,1,2,2,2,3,3,4)
system.time(permuteGeneral(4, freqs = c(4,3,2,1)))
#> user system elapsed
#> 0 0 0
#### Examples using "upper" and "lower":
## Generate some random data
set.seed(1009)
mySamp = sort(rnorm(75, 997, 23))
permuteCount(75, 10, freqs = rep(1:3, 25))
#> Big Integer ('bigz') :
#> [1] 4606842579278688000
# >Big Integer ('bigz') :
# >[1] 4606842576291495952
## See specific range of permutations
permuteGeneral(75, 10, freqs = rep(1:3, 25),
lower = 1e12, upper = 1e12 + 10)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 1 2 2 10 22 18 69 57 47 25
#> [2,] 1 2 2 10 22 18 69 57 47 26
#> [3,] 1 2 2 10 22 18 69 57 47 27
#> [4,] 1 2 2 10 22 18 69 57 47 28
#> [5,] 1 2 2 10 22 18 69 57 47 29
#> [6,] 1 2 2 10 22 18 69 57 47 30
#> [7,] 1 2 2 10 22 18 69 57 47 31
#> [8,] 1 2 2 10 22 18 69 57 47 32
#> [9,] 1 2 2 10 22 18 69 57 47 33
#> [10,] 1 2 2 10 22 18 69 57 47 34
#> [11,] 1 2 2 10 22 18 69 57 47 35
## Researcher only needs 1000 7-tuples of mySamp
## such that the sum is greater than 7200.
system.time(comboGeneral(mySamp, 7, FALSE, constraintFun = "sum",
comparisonFun = ">", limitConstraints = 7200, upper = 1000))
#> user system elapsed
#> 0 0 0
## Similarly, you can use "lower" to obtain the last rows.
## Generate the last 10 rows
system.time(comboGeneral(mySamp, 7, lower = choose(75, 7) - 9))
#> user system elapsed
#> 0 0 0
## Or if you would like to generate a specific chunk,
## use both "lower" and "upper". E.g. Generate one
## million combinations starting with the 900,000,001
## lexicographic combination.
t1 = comboGeneral(mySamp, 7,
lower = 9*10^8 + 1,
upper = 9*10^8 + 10^6)
## class of the source vector is preserved
class(comboGeneral(5,3)[1,]) == class(1:5)
#> [1] TRUE
class(comboGeneral(c(1,2:5),3)[1,]) == class(c(1,2:5))
#> [1] TRUE
class(comboGeneral(factor(month.name),3)[1,]) == class(factor(month.name))
#> [1] TRUE
## Using keepResults will add a column of results
t2 = permuteGeneral(-3,6,TRUE,constraintFun = "prod",
keepResults = TRUE)
t3 = comboGeneral(-3,6,TRUE,constraintFun = "sum",
comparisonFun = "==",
limitConstraints = -8,
keepResults = TRUE)
## Using multiple constraints:
## Get combinations such that the product
## is between 3000 and 4000 inclusive
comboGeneral(5, 7, TRUE, constraintFun = "prod",
comparisonFun = c(">=","<="),
limitConstraints = c(3000, 4000),
keepResults = TRUE)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 1 1 5 5 5 5 5 3125
#> [2,] 1 2 3 4 5 5 5 3000
#> [3,] 1 2 3 5 5 5 5 3750
#> [4,] 1 2 4 4 4 5 5 3200
#> [5,] 1 2 4 4 5 5 5 4000
#> [6,] 1 3 3 3 5 5 5 3375
#> [7,] 1 3 3 4 4 5 5 3600
#> [8,] 1 3 4 4 4 4 4 3072
#> [9,] 1 3 4 4 4 4 5 3840
#> [10,] 2 2 2 3 5 5 5 3000
#> [11,] 2 2 2 4 4 5 5 3200
#> [12,] 2 2 2 4 5 5 5 4000
#> [13,] 2 2 3 3 4 5 5 3600
#> [14,] 2 2 3 4 4 4 4 3072
#> [15,] 2 2 3 4 4 4 5 3840
#> [16,] 2 3 3 3 3 4 5 3240
#> [17,] 2 3 3 3 4 4 4 3456
#> [18,] 3 3 3 3 3 3 5 3645
#> [19,] 3 3 3 3 3 4 4 3888
## Or, get the combinations such that the
## product is less than or equal to 10 or
## greater than or equal to 40000
comboGeneral(5, 7, TRUE, constraintFun = "prod",
comparisonFun = c("<=",">="),
limitConstraints = c(10, 40000),
keepResults = TRUE)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 1 1 1 1 1 1 1 1
#> [2,] 1 1 1 1 1 1 2 2
#> [3,] 1 1 1 1 1 1 3 3
#> [4,] 1 1 1 1 1 1 4 4
#> [5,] 1 1 1 1 1 1 5 5
#> [6,] 1 1 1 1 1 2 2 4
#> [7,] 1 1 1 1 1 2 3 6
#> [8,] 1 1 1 1 1 2 4 8
#> [9,] 1 1 1 1 1 2 5 10
#> [10,] 1 1 1 1 1 3 3 9
#> [11,] 1 1 1 1 2 2 2 8
#> [12,] 5 5 5 5 5 5 5 78125
#> [13,] 5 5 5 5 5 5 4 62500
#> [14,] 5 5 5 5 5 5 3 46875
#> [15,] 5 5 5 5 5 4 4 50000
#> [16,] 5 5 5 5 4 4 4 40000
#### General subset sum problem
set.seed(516781810)
comboGeneral(runif(100, 0, 42), 5, constraintFun = "mean",
comparisonFun = "==", limitConstraints = 30,
tolerance = 0.0000002)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 7.554598 31.01344 34.7431 37.39019 39.29867
#### Integer Partitions
comboGeneral(0:5, 5, TRUE, constraintFun = "sum",
comparisonFun = "==", limitConstraints = 5)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0 0 0 0 5
#> [2,] 0 0 0 1 4
#> [3,] 0 0 0 2 3
#> [4,] 0 0 1 1 3
#> [5,] 0 0 1 2 2
#> [6,] 0 1 1 1 2
#> [7,] 1 1 1 1 1
## Using FUN
comboGeneral(10000, 5, lower = 20, upper = 22,
FUN = function(x) {
which(cummax(x) %% 2 == 1)
})
#> [[1]]
#> [1] 1 3
#>
#> [[2]]
#> [1] 1 3 5
#>
#> [[3]]
#> [1] 1 3
#>
if (FALSE) {
## Parallel example generating more than 2^31 - 1 combinations.
library(parallel)
numCores = detectCores() - 1
## 10086780 evenly divides choose(35, 15) and is "small enough" to
## generate quickly in chunks.
system.time(mclapply(seq(1, comboCount(35, 15), 10086780), function(x) {
a = comboGeneral(35, 15, lower = x, upper = x + 10086779)
## do something
x
}, mc.cores = numCores))
## Find 13-tuple combinations of 1:25 such
## that the mean is less than 10
system.time(myComb <- comboGeneral(25, 13, FALSE,
constraintFun = "mean",
comparisonFun = "<",
limitConstraints = 10))
## Alternatively, you must generate all combinations and subsequently
## subset to obtain the combinations that meet the criteria
system.time(myComb2 <- combn(25, 13))
system.time(myCols <- which(colMeans(myComb2) < 10))
system.time(myComb2 <- myComb2[, myCols])
## Any variation is much slower
system.time(myComb2 <- combn(25, 13)[,combn(25, 13, mean) < 10])
## Test equality with myComb above
all.equal(myComb, t(myComb2))
## much faster using Parallel = TRUE
system.time(permuteGeneral(15, 8, lower = 250000000, Parallel = TRUE))
system.time(permuteGeneral(15, 8, lower = 250000000))
system.time(comboGeneral(30, 10, Parallel = TRUE))
system.time(comboGeneral(30, 10))
## Parallel also works when applying constraintFun solely
system.time(comboGeneral(30, 10, Parallel = TRUE, constraintFun = "sum"))
system.time(comboGeneral(30, 10, constraintFun = "sum"))
## Depending on # of cores available, using Parallel with
## constraintFun is faster than rowSums or rowMeans alone
combs = comboGeneral(30, 10)
system.time(rowSums(combs))
## Fun example... see stackoverflow:
## https://stackoverflow.com/q/22218640/4408538
system.time(permuteGeneral(seq(0L,100L,10L),8,TRUE,
constraintFun = "sum",
comparisonFun = "==",
limitConstraints = 100))
}
```